Abstract

The Fault-Tolerant Facility Placement problem (FTFP) is a generalization of the classic
Uncapacitated Facility Location Problem (UFL). In FTFP we are given a set of facility sites
and a set of clients. Opening a facility at site $i$ costs $f_i$ and connecting client $j$ to a facility
at site $i$ costs $d_{ij}$. We assume that the connection costs (distances) $d_{ij}$ satisfy the triangle
inequality. Multiple facilities can be opened at any site. Each client $j$ has a demand $r_j$,
which means that it needs to be connected to $r_j$ different facilities (some of which could
be located on the same site). The goal is to minimize the sum of facility opening cost and
connection cost.

The main result of this paper is a 1.575-approximation algorithm for FTFP, based on 
LP-rounding. The algorithm first reduces the demands to values polynomial in the number 
of sites. Then it uses a technique that we call adaptive partitioning, which partitions the 
instance by splitting clients into unit demands and creating a number of (not yet opened) 
facilities at each site. It also partitions the optimal fractional solution to produce a frac- 
tional solution for this new instance. The partitioned fractional solution satisfies a number 
of properties that allow us to exploit existing LP-rounding methods for UFL to round our 
partitioned solution to an integral solution, preserving the approximation ratio. In particu- 
lar, our 1.575-approximation algorithm is based on the ideas from the 1.575-approximation 
algorithm for UFL by Byrka et al., with changes necessary to satisfy the fault-tolerance 
requirement. 
1. Introduction

In the Fault-Tolerant Facility Placement problem (FTFP), we are given a set $F$ of sites
at which facilities can be built, and a set $C$ of clients with some demands that need to be
satisfied by different facilities. A client $j \in C$ has demand $r_j$. Building one facility at a site
$i \in F$ incurs a cost $f_i$, and connecting one unit of demand from client $j$ to a facility at site $i$
costs $d_{ij}$. Throughout the paper we assume that the connection costs (distances) $d_{ij}$ form a
metric, that is, they are symmetric and satisfy the triangle inequality. In a feasible solution, 
some number of facilities, possibly zero, are opened at each site i, and demands from each 
client are connected to those open facilities, with the constraint that demands from the same 
client have to be connected to different facilities. Note that any two facilities at the same 
site are considered different. 

It is easy to see that if all $r_j = 1$ then FTFP reduces to the classic Uncapacitated Facility
Location problem (UFL). If we add a constraint that each site can have at most one facility
built on it, then the problem becomes equivalent to the Fault-Tolerant Facility Location
problem (FTFL). One implication of the one-facility-per-site restriction in FTFL is that
$\max_{j \in C} r_j \le |F|$, while in FTFP the values of the $r_j$'s can be much bigger than $|F|$.

The UFL problem has a long history; in particular, great progress has been achieved in 
the past two decades in developing techniques for designing constant-ratio approximation 
algorithms for UFL. Shmoys, Tardos and Aardal [16] proposed an approach based on
LP-rounding, which they used to achieve a ratio of 3.16. This was then improved by Chudak [5] to
1.736, and later by Sviridenko [17] to 1.582. The best known "pure" LP-rounding algorithm
is due to Byrka et al. [3] with ratio 1.575. Byrka and Aardal [2] gave a hybrid algorithm that
combines LP-rounding and dual-fitting (based on [10]), achieving a ratio of 1.5. Recently,
Li [13] showed that, with a more refined analysis and randomizing the scaling parameter
used in [2], the ratio can be improved to 1.488. This is the best known approximation
result for UFL. Other techniques include the primal-dual algorithm with ratio 3 by Jain and
Vazirani [11], the dual-fitting method by Jain et al. [10] that gives ratio 1.61, and a local
search heuristic by Arya et al. [1] with approximation ratio 3. On the hardness side, UFL is
easily shown to be NP-hard, and it is known that it is not possible to approximate UFL in
polynomial time with ratio less than 1.463, provided that
$\mathrm{NP} \not\subseteq \mathrm{DTIME}(n^{O(\log\log n)})$ [6]. An
observation by Sviridenko strengthened the underlying assumption to $\mathrm{P} \ne \mathrm{NP}$ (see [19]).

FTFL was first introduced by Jain and Vazirani [12], who adapted their primal-dual
algorithm for UFL to obtain a ratio of $3 \ln(\max_{j \in C} r_j)$. All subsequently discovered
constant-ratio approximation algorithms use variations of LP-rounding. The first such algorithm, by
Guha et al. [7], adapted the approach for UFL from [16]. Swamy and Shmoys [18] improved
the ratio to 2.076 using the idea of pipage rounding introduced in [17]. Most recently, Byrka et
al. [4] improved the ratio to 1.7245 using dependent rounding and laminar clustering.



FTFP is a natural generalization of UFL. It was first studied by Xu and Shen [20], who 
extended the dual-fitting algorithm from [10] to give an approximation algorithm with a ratio 
claimed to be 1.861. However, their algorithm runs in polynomial time only if $\max_{j \in C} r_j$ is
polynomial in $O(|F| \cdot |C|)$, and the analysis of the performance guarantee in [20] is flawed. To
date, the best approximation ratio for FTFP in the literature is 3.16, established by Yan and
Chrobak [21], while the only known lower bound is the 1.463 lower bound for UFL from [6],
as UFL is a special case of FTFP. If all demand values $r_j$ are equal, the problem can be
solved by simple scaling and applying LP-rounding algorithms for UFL. This does not affect
the approximation ratio, thus achieving ratio 1.575 for this special case (see also [14]).

The main result of this paper is an LP-rounding algorithm for FTFP with approximation 
ratio 1.575, matching the best ratio for UFL achieved via the LP-rounding method [3] and
significantly improving our earlier bound in [21]. In Section 3 we prove that, for the purpose of
LP-based approximations, the general FTFP problem can be reduced to the restricted version 
where all demand values are polynomial in the number of sites. This demand reduction trick 
itself gives us a ratio of 1.7245, since we can then treat an instance of FTFP as an instance 
of FTFL by creating a sufficient (but polynomial) number of facilities at each site, and then 
using the algorithm from [4] to solve the FTFL instance. 

The reduction to polynomial demands suggests an approach where clients' demands are 
split into unit demands. These unit demands can be thought of as "unit-demand clients", 
and a natural approach would be to adapt LP-rounding methods from [HI O |3] to this 
new set of unit-demand clients. Roughly, these algorithms iteratively pick a client that 
minimizes a certain cost function (that varies for different algorithms) and open one facility 
in the neighborhood of this client. The remaining clients are then connected to these open 
facilities. In order for this to work, we also need to convert the optimal fractional solution 
$(x^*, y^*)$ of the original instance into a solution $(\bar x, \bar y)$ of the modified instance which then can
be used in the LP-rounding process. This can be thought of as partitioning the fractional
solution, as each connection value $x^*_{ij}$ must be divided between the $r_j$ unit demands of client
$j$ in some way. In Section 4 we formulate a set of properties required for this partitioning
to work. For example, one property guarantees that we can connect demands to facilities 
so that two demands from the same client are connected to different facilities. Then we 
present our adaptive partitioning technique that computes a partitioning with all the desired 
properties. Using adaptive partitioning we were able to extend the algorithms for UFL from 
[HI El 1^ to FTFP. We illustrate the fundamental ideas of our approach in Section 5, showing
how they can be used to design an LP-rounding algorithm with ratio 3. In Section 6 we
refine the algorithm to improve the approximation ratio to $1 + 2/e \approx 1.736$. Finally, in
Section 7, we improve it even further to 1.575, the main result of this paper.

Summarizing, our contributions are two-fold: One, we show that the existing LP-rounding 
algorithms for UFL can be extended to a much more general problem FTFP, retaining 
the approximation ratio. We believe that, should even better LP-rounding algorithms be 
developed for UFL in the future, using our demand reduction and adaptive partitioning 
methods, it should be possible to extend them to FTFP. In fact, some improvement of the 
ratio should be achieved by randomizing the scaling parameter $\gamma$ used in our algorithm,
as Li showed in [13] for UFL. (Since the ratio 1.488 for UFL in [13] also uses dual-fitting
algorithms [T3], we would not obtain the same ratio for FTFP yet using only LP-rounding.)

Two, our ratio of 1.575 is significantly better than the best currently known ratio of 1.7245 
for the closely-related FTFL problem. This suggests that in the fault-tolerant scenario the 
capability of creating additional copies of facilities on the existing sites makes the problem 
easier from the point of view of approximation. 

2. The LP Formulation 

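The FTFP problem has a natural Integer Programming (IP) formulation. Let $y_i$ represent
the number of facilities built at site $i$ and let $x_{ij}$ represent the number of connections from
client $j$ to facilities at site $i$. If we relax the integrality constraints, we obtain the following
LP:

$$
\begin{aligned}
\text{minimize}\quad & cost(x, y) = \sum_{i \in F} f_i y_i + \sum_{i \in F, j \in C} d_{ij} x_{ij} \\
\text{subject to}\quad & y_i - x_{ij} \ge 0 && \forall i \in F, j \in C \\
& \sum_{i \in F} x_{ij} \ge r_j && \forall j \in C \\
& x_{ij} \ge 0,\ y_i \ge 0 && \forall i \in F, j \in C
\end{aligned}
\tag{1}
$$

The dual program is:

$$
\begin{aligned}
\text{maximize}\quad & \sum_{j \in C} r_j \alpha_j \\
\text{subject to}\quad & \sum_{j \in C} \beta_{ij} \le f_i && \forall i \in F \\
& \alpha_j - \beta_{ij} \le d_{ij} && \forall i \in F, j \in C \\
& \alpha_j \ge 0,\ \beta_{ij} \ge 0 && \forall i \in F, j \in C
\end{aligned}
\tag{2}
$$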



In each of our algorithms we will fix some optimal solutions of the LPs (1) and (2) that
we will denote by $(x^*, y^*)$ and $(\alpha^*, \beta^*)$, respectively.

With $(x^*, y^*)$ fixed, we can define the optimal facility cost as $F^* = \sum_{i \in F} f_i y^*_i$ and the
optimal connection cost as $C^* = \sum_{i \in F, j \in C} d_{ij} x^*_{ij}$. Then $LP^* = cost(x^*, y^*) = F^* + C^*$ is the
joint optimal value of (1) and (2). We can also associate with each client $j$ its fractional
connection cost $C^*_j = \sum_{i \in F} d_{ij} x^*_{ij}$. Clearly, $C^* = \sum_{j \in C} C^*_j$. Throughout the paper we will
use notation OPT for the value of an optimal integral solution of (1). OPT is the value we wish to
approximate, but, since $OPT \ge LP^*$, we can instead use $LP^*$ to estimate the approximation
ratio of our algorithms.
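
As a small illustration of LP (1) and its dual (2), the following sketch evaluates $cost(x, y)$ and checks primal feasibility on the four-site instance used in the example of Section 4. The helper names `lp_cost` and `is_feasible` are hypothetical, and the fractional solution below is an assumption reconstructed from the narrative of that example (each client $j$ is connected to every site $i \ne j$ at value $y_i$), not a value copied from the paper's table.

```python
from fractions import Fraction as Fr

def lp_cost(f, d, x, y):
    """cost(x, y) = sum_i f_i*y_i + sum_{i,j} d_ij*x_ij, as in LP (1)."""
    sites, clients = range(len(f)), range(len(d[0]))
    facility = sum(f[i] * y[i] for i in sites)
    connection = sum(d[i][j] * x[i][j] for i in sites for j in clients)
    return facility + connection

def is_feasible(x, y, r):
    """Check the constraints of LP (1): 0 <= x_ij <= y_i and sum_i x_ij >= r_j."""
    sites, clients = range(len(y)), range(len(r))
    capped = all(0 <= x[i][j] <= y[i] for i in sites for j in clients)
    served = all(sum(x[i][j] for i in sites) >= r[j] for j in clients)
    return capped and served

# Instance from the example in Section 4 (0-based indices here).
n = 4
r = [1, 2, 2, 2]
f = [1] * n
d = [[3 if i == j else 1 for j in range(n)] for i in range(n)]

# Assumed fractional solution: client j uses every site i != j at value y_i.
third = Fr(1, 3)
y = [4 * third, third, third, third]
x = [[0 if i == j else y[i] for j in range(n)] for i in range(n)]

alpha = [Fr(4, 3)] * n          # dual values quoted in the example
primal = lp_cost(f, d, x, y)
dual = sum(r[j] * alpha[j] for j in range(n))
print(is_feasible(x, y, r), primal, dual)
```

Under these assumptions the primal cost and the dual objective coincide, which is what one expects from a pair of optimal solutions of (1) and (2).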

Completeness and facility splitting. Define $(x^*, y^*)$ to be complete if $x^*_{ij} > 0$ implies
that $x^*_{ij} = y^*_i$ for all $i, j$. In other words, each connection either uses a site fully or not at
all. As shown by Chudak and Shmoys [5], we can modify the given instance by adding at
most $|C|$ sites to obtain an equivalent instance that has a complete optimal solution, where
"equivalent" means that the values of $F^*$, $C^*$ and $LP^*$, as well as OPT, are not affected.
Roughly, the argument is this: We notice that, without loss of generality, for each client $k$
there exists at most one site $i$ such that $0 < x^*_{ik} < y^*_i$. We can then perform the following
facility splitting operation on $i$: introduce a new site $i'$, let $y^*_{i'} = y^*_i - x^*_{ik}$, redefine $y^*_i$ to be
$x^*_{ik}$, and then for each client $j$ redistribute $x^*_{ij}$ so that $i$ retains as much connection value as
possible and $i'$ receives the rest. Specifically, we set

$y^*_{i'} \leftarrow y^*_i - x^*_{ik}$, $\quad y^*_i \leftarrow x^*_{ik}$, and

$x^*_{i'j} \leftarrow \max(x^*_{ij} - x^*_{ik}, 0)$, $\quad x^*_{ij} \leftarrow \min(x^*_{ij}, x^*_{ik})$ for all $j \ne k$.

This operation eliminates the partial connection between $k$ and $i$ and does not create any
new partial connections. Each client can split at most one site and hence we shall have at
most $|C|$ more sites.

By the above paragraph, without loss of generality we can assume that the optimal 
fractional solution {x*,y*) is complete. This assumption will in fact greatly simplify some 
of the arguments in the paper. Additionally, we will frequently use the facility splitting 
operation described above in our algorithms to obtain fractional solutions with desirable 
properties. 
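
The facility splitting operation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `split_site` and `is_complete` are hypothetical, the solution is stored as a sites-by-clients matrix, and the new site $i'$ is appended as the last row.

```python
from fractions import Fraction as Fr

def is_complete(x, y):
    """True if every connection value x[i][j] is either 0 or equal to y[i]."""
    return all(v in (0, y[i]) for i, row in enumerate(x) for v in row)

def split_site(x, y, i, k):
    """Split site i at the partial connection value x[i][k], 0 < x[i][k] < y[i].

    Returns a new pair (x, y) in which site i keeps value x[i][k] and a new
    site i' (appended as the last row) receives the remaining y[i] - x[i][k].
    """
    cut = x[i][k]
    assert 0 < cut < y[i], "client k must be partially connected to site i"
    new_y = y[:i] + [cut] + y[i + 1:] + [y[i] - cut]
    new_x = [row[:] for row in x]
    # Site i' receives whatever exceeds the cut; site i retains the rest.
    new_x.append([max(v - cut, 0) for v in x[i]])
    new_x[i] = [min(v, cut) for v in x[i]]
    return new_x, new_y

# Hypothetical data: one site of value 1, client 0 uses half of it, client 1 all of it.
y = [Fr(1)]
x = [[Fr(1, 2), Fr(1)]]
x2, y2 = split_site(x, y, i=0, k=0)
```

In this toy case the total connection value of each client is unchanged, and the partial connection of client 0 disappears because site 0 now has value exactly 1/2.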

3. Reduction to Polynomial Demands 

This section presents a demand reduction trick that reduces the problem for arbitrary 
demands to a special case where demands are bounded by $|F|$, the number of sites. (The
formal statement is a little more technical; see Theorem 2.) Our algorithms in the sections
that follow process individual demands of each client one by one, and thus they critically 
rely on the demands being bounded polynomially in terms of |F| and |C| to keep the overall 
running time polynomial. 

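The reduction is based on an optimal fractional solution $(x^*, y^*)$ of LP (1). From the
optimality of this solution, we can also assume that $\sum_{i \in F} x^*_{ij} = r_j$ for all $j \in C$. As explained
in Section 2, we can assume that $(x^*, y^*)$ is complete, that is $x^*_{ij} > 0$ implies $x^*_{ij} = y^*_i$ for all
$i, j$. We split this solution into two parts, namely $(x^*, y^*) = (\hat x, \hat y) + (\dot x, \dot y)$, where

$\hat y_i \leftarrow \lfloor y^*_i \rfloor$, $\quad \hat x_{ij} \leftarrow \lfloor x^*_{ij} \rfloor$ and

$\dot y_i \leftarrow y^*_i - \lfloor y^*_i \rfloor$, $\quad \dot x_{ij} \leftarrow x^*_{ij} - \lfloor x^*_{ij} \rfloor$

for all $i, j$. Now we construct two FTFP instances $\hat{\mathcal I}$ and $\dot{\mathcal I}$ with the same parameters as the
original instance, except that the demand of each client $j$ is $\hat r_j = \sum_{i \in F} \hat x_{ij}$ in instance $\hat{\mathcal I}$ and
$\dot r_j = \sum_{i \in F} \dot x_{ij} = r_j - \hat r_j$ in instance $\dot{\mathcal I}$. It is obvious that if we have integral solutions to both
$\hat{\mathcal I}$ and $\dot{\mathcal I}$ then, when added together, they form an integral solution to the original instance.
Moreover, we have the following lemma.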

Lemma 1. (i) $(\hat x, \hat y)$ is a feasible integral solution to instance $\hat{\mathcal I}$.

(ii) $(\dot x, \dot y)$ is a feasible fractional solution to instance $\dot{\mathcal I}$.

(iii) $\dot r_j \le |F|$ for every client $j$.
Proof. (i) For feasibility, we need to verify that the constraints of LP (1) are satisfied.
Directly from the definition, we have $\hat r_j = \sum_{i \in F} \hat x_{ij}$. For any $i$ and $j$, by the feasibility of
$(x^*, y^*)$ we have $\hat x_{ij} = \lfloor x^*_{ij} \rfloor \le \lfloor y^*_i \rfloor = \hat y_i$.

(ii) From the definition, we have $\dot r_j = \sum_{i \in F} \dot x_{ij}$. It remains to show that $\dot y_i \ge \dot x_{ij}$ for all
$i, j$. If $x^*_{ij} = 0$, then $\dot x_{ij} = 0$ and we are done. Otherwise, by completeness, we have $x^*_{ij} = y^*_i$.
Then $\dot y_i = y^*_i - \lfloor y^*_i \rfloor = x^*_{ij} - \lfloor x^*_{ij} \rfloor = \dot x_{ij}$.

(iii) From the definition of $\dot x_{ij}$ we have $\dot x_{ij} < 1$. Then the bound follows from the definition
of $\dot r_j$. □

Notice that our construction relies on the completeness assumption; in fact, it is easy
to give an example where $(\dot x, \dot y)$ would not be feasible if we used a non-complete optimal
solution $(x^*, y^*)$. Note also that the solutions $(\hat x, \hat y)$ and $(\dot x, \dot y)$ are in fact optimal for their
corresponding instances, for if a better solution to $\hat{\mathcal I}$ or $\dot{\mathcal I}$ existed, it could give us a solution
to $\mathcal I$ with a smaller objective value.
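
The split into an integral part and a fractional part is easy to check numerically. The sketch below is only an illustration: the names `split_solution` and `demands` are hypothetical, and the three-site, two-client complete solution is made-up data chosen so that the demands are integers.

```python
from fractions import Fraction as Fr
from math import floor

def split_solution(x, y):
    """Return ((x_hat, y_hat), (x_dot, y_dot)): floors and fractional parts."""
    x_hat = [[floor(v) for v in row] for row in x]
    y_hat = [floor(v) for v in y]
    x_dot = [[v - floor(v) for v in row] for row in x]
    y_dot = [v - floor(v) for v in y]
    return (x_hat, y_hat), (x_dot, y_dot)

def demands(x):
    """Demand of each client implied by a solution: r_j = sum_i x[i][j]."""
    return [sum(row[j] for row in x) for j in range(len(x[0]))]

# Hypothetical complete solution: every nonzero x[i][j] equals y[i].
y = [Fr(5, 2), Fr(1, 2), Fr(1)]
x = [[Fr(5, 2), 0],
     [Fr(1, 2), 0],
     [0, Fr(1)]]

(x_hat, y_hat), (x_dot, y_dot) = split_solution(x, y)
r, r_hat, r_dot = demands(x), demands(x_hat), demands(x_dot)
```

Here `r` is `[3, 1]`, the integral part carries demands `[2, 1]`, and the fractional part carries `[1, 0]`, which stays below the number of sites as Lemma 1(iii) states.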

Theorem 2. Suppose that there is a polynomial-time algorithm $A$ that, for any instance
of FTFP with maximum demand bounded by $|F|$, computes an integral solution that
approximates the fractional optimum of this instance within factor $\rho \ge 1$. Then there is a
$\rho$-approximation algorithm $A'$ for FTFP.

Proof. Given an FTFP instance with arbitrary demands, Algorithm $A'$ works as follows: it
solves the LP (1) to obtain a fractional optimal solution $(x^*, y^*)$, then it constructs instances
$\hat{\mathcal I}$ and $\dot{\mathcal I}$ described above, applies algorithm $A$ to $\dot{\mathcal I}$, and finally combines (by adding the
values) the integral solution $(\hat x, \hat y)$ of $\hat{\mathcal I}$ and the integral solution of $\dot{\mathcal I}$ produced by $A$. This
clearly produces a feasible integral solution for the original instance $\mathcal I$. The solution produced
by $A$ has cost at most $\rho \cdot cost(\dot x, \dot y)$, because $(\dot x, \dot y)$ is feasible for $\dot{\mathcal I}$. Thus the cost of $A'$ is
at most

$cost(\hat x, \hat y) + \rho \cdot cost(\dot x, \dot y) \le \rho (cost(\hat x, \hat y) + cost(\dot x, \dot y)) = \rho \cdot LP^* \le \rho \cdot OPT$,

where the first inequality follows from $\rho \ge 1$. This completes the proof. □

4. Adaptive Partitioning 

In this section we develop our second technique, which we call adaptive partitioning.
Given an FTFP instance and an optimal fractional solution $(x^*, y^*)$ to LP (1), we split each
client $j$ into $r_j$ individual unit demand points (or just demands), and we split each site $i$ into
no more than $|F| + 2R|C|^2$ facility points (or facilities), where $R = \max_{j \in C} r_j$. We denote
the demand set by $\overline{C}$ and the facility set by $\overline{F}$, respectively. We will also partition $(x^*, y^*)$
into a fractional solution $(\bar x, \bar y)$ for the split instance. We will typically use symbols $\nu$ and
$\mu$ to index demands and facilities respectively, that is $\bar x = (\bar x_{\mu\nu})$ and $\bar y = (\bar y_\mu)$. As before,
the neighborhood of a demand $\nu$ is $N(\nu) = \{\mu \in \overline{F} : \bar x_{\mu\nu} > 0\}$. We will use notation $\nu \in j$
to mean that $\nu$ is a demand of client $j$; similarly, $\mu \in i$ means that facility $\mu$ is on site $i$.
Different demands of the same client (that is, $\nu, \nu' \in j$) are called siblings. Further, we use
the convention that $f_\mu = f_i$ for $\mu \in i$, $\alpha^*_\nu = \alpha^*_j$ for $\nu \in j$, and $d_{\mu\nu} = d_{\mu j} = d_{ij}$ for $\mu \in i$
and $\nu \in j$. We define $C^{avg}_\nu = \sum_{\mu \in N(\nu)} d_{\mu\nu} \bar x_{\mu\nu} = \sum_{\mu \in \overline{F}} d_{\mu\nu} \bar x_{\mu\nu}$. One can think of $C^{avg}_\nu$ as the
average connection cost of demand $\nu$, if we chose a connection to facility $\mu$ with probability
$\bar x_{\mu\nu}$. In our partitioned fractional solution we guarantee for every $\nu$ that $\sum_{\mu \in \overline{F}} \bar x_{\mu\nu} = 1$.

Some demands in $\overline{C}$ will be designated as primary demands and the set of primary
demands will be denoted by $P$. By definition we have $P \subseteq \overline{C}$. In addition, we will use
the overlap structure between demand neighborhoods to define a mapping that assigns each
demand $\nu \in \overline{C}$ to some primary demand $\kappa \in P$. As shown in the rounding algorithms
in later sections, for each primary demand we guarantee exactly one open facility in its 
neighborhood, while for a non-primary demand, there is constant probability that none of 
its neighbors open. In this case we estimate its connection cost by the distance to the facility 
opened in its assigned primary demand's neighborhood. For this reason the connection cost 
of a primary demand must be "small" compared to the non-primary demands assigned to 
it. We also need sibling demands assigned to different primary demands to satisfy the fault- 
tolerance requirement. Specifically, this partitioning will be constructed to satisfy a number 
of properties that are detailed below. 

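(PS) Partitioned solution. Vector $(\bar x, \bar y)$ is a partition of $(x^*, y^*)$, with unit-value demands,
that is:

1. $\sum_{\mu \in \overline{F}} \bar x_{\mu\nu} = 1$ for each demand $\nu \in \overline{C}$.

2. $\sum_{\mu \in i, \nu \in j} \bar x_{\mu\nu} = x^*_{ij}$ for each site $i \in F$ and client $j \in C$.

3. $\sum_{\mu \in i} \bar y_\mu = y^*_i$ for each site $i \in F$.

(CO) Completeness. Solution $(\bar x, \bar y)$ is complete, that is $\bar x_{\mu\nu} \ne 0$ implies $\bar x_{\mu\nu} = \bar y_\mu$, for all
$\mu \in \overline{F}$, $\nu \in \overline{C}$.

(PD) Primary demands. Primary demands satisfy the following conditions:

1. For any two different primary demands $\kappa, \kappa' \in P$ we have $N(\kappa) \cap N(\kappa') = \emptyset$.

2. For each site $i \in F$, $\sum_{\mu \in i} \sum_{\kappa \in P} \bar x_{\mu\kappa} \le y^*_i$.

3. Each demand $\nu \in \overline{C}$ is assigned to one primary demand $\kappa \in P$ such that

(a) $N(\nu) \cap N(\kappa) \ne \emptyset$, and

(b) $C^{avg}_\nu + \alpha^*_\nu \ge C^{avg}_\kappa + \alpha^*_\kappa$.

(SI) Siblings. For any pair $\nu, \nu'$ of different siblings we have

1. $N(\nu) \cap N(\nu') = \emptyset$.

2. If $\nu$ is assigned to a primary demand $\kappa$ then $N(\nu') \cap N(\kappa) = \emptyset$. In particular,
by Property (PD.3(a)), this implies that different sibling demands are assigned
to different primary demands.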
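
Several of these properties are purely combinatorial and can be verified mechanically. The following sketch checks (PS.1), (CO), (PD.1), (PD.3(a)), (SI.1) and (SI.2) for a partitioned solution stored as nested dictionaries. The function name `check_partition` and the data layout are assumptions made for illustration. The sample data follows the narrative of the worked example later in this section, with hypothetical labels such as `"1a"` and `"1b"` for the first and second facility at site 1.

```python
from fractions import Fraction as Fr
from itertools import combinations

def check_partition(xbar, ybar, client_of, primary, assigned):
    """Return the names of the violated properties (empty list if none)."""
    N = {v: {m for m, val in conn.items() if val > 0} for v, conn in xbar.items()}
    bad = []
    if any(sum(conn.values()) != 1 for conn in xbar.values()):
        bad.append("PS.1")
    if any(val not in (0, ybar[m]) for conn in xbar.values() for m, val in conn.items()):
        bad.append("CO")
    if any(N[a] & N[b] for a, b in combinations(primary, 2)):
        bad.append("PD.1")
    if any(not (N[v] & N[assigned[v]]) for v in xbar):
        bad.append("PD.3a")
    siblings = [(a, b) for a, b in combinations(xbar, 2) if client_of[a] == client_of[b]]
    if any(N[a] & N[b] for a, b in siblings):
        bad.append("SI.1")
    if any(N[b] & N[assigned[a]] or N[a] & N[assigned[b]] for a, b in siblings):
        bad.append("SI.2")
    return bad

third = Fr(1, 3)
ybar = {"1a": Fr(1), "1b": third, "2a": third, "3a": third, "4a": third}
nbhd = {"1'": ["2a", "3a", "4a"], "2'": ["1a"], "2''": ["1b", "3a", "4a"],
        "3'": ["1a"], "3''": ["1b", "2a", "4a"],
        "4'": ["1a"], "4''": ["1b", "2a", "3a"]}
# By completeness, a demand connected to facility m uses the full value ybar[m].
xbar = {v: {m: ybar[m] for m in ms} for v, ms in nbhd.items()}
client_of = {v: v[0] for v in xbar}           # "2''" is a demand of client "2"
primary = ["1'", "2'"]
assigned = {"1'": "1'", "2'": "2'", "2''": "1'", "3'": "2'",
            "3''": "1'", "4'": "2'", "4''": "1'"}

violations = check_partition(xbar, ybar, client_of, primary, assigned)
```

The check does not cover (PS.2), (PS.3), (PD.2) or (PD.3(b)), which need the original solution $(x^*, y^*)$, the site of each facility, and the dual values.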
As we shall demonstrate in later sections, these properties allow us to extend known UFL
rounding algorithms to obtain an integral solution to our FTFP problem with a matching
approximation ratio. Our partitioning is "adaptive" in the sense that it is constructed one
demand at a time, and the connection values for the demands of a client depend on the choice
of earlier demands, of this or other clients, and their connection values. We would like to point
out that the adaptive partitioning process for the 1.575-approximation algorithm (Section 7)
is more subtle than that for the 3-approximation (Section 5) and the 1.736-approximation
algorithms (Section 6), due to the introduction of close and far neighborhoods.

Implementation of Adaptive Partitioning. We now describe an algorithm for
partitioning the instance and the fractional solution so that the properties (PS), (CO), (PD), and
(SI) are satisfied. Recall that $\overline{F}$ and $\overline{C}$, respectively, denote the sets of facilities and demands
that will be created in this stage, and $(\bar x, \bar y)$ is the partitioned solution to be computed.

The adaptive partitioning algorithm consists of two phases: Phase 1 is called the
partitioning phase and Phase 2 is called the augmenting phase. Phase 1 is done in iterations,
where in each iteration we find the "best" client $j$ and create a new demand $\nu$ out of it. This
demand either becomes a primary demand itself, or it is assigned to some existing primary
demand. We call a client $j$ exhausted when all its $r_j$ demands have been created and assigned
to some primary demands. Phase 1 completes when all clients are exhausted. In Phase 2
we ensure that every demand has total connection value $\sum_{\mu} \bar x_{\mu\nu}$ equal to 1, that is condition
(PS.1).

For each site $i$ we will initially create one "big" facility $\mu$ with initial value $\bar y_\mu = y^*_i$. While
we partition the instance, creating new demands and connections, this facility may end up
being split into more facilities to preserve completeness of the fractional solution. Also,
we will gradually decrease the fractional connection vector for each client $j$, to account for
the demands already created for $j$ and their connection values. These decreased connection
values will be stored in an auxiliary vector $\tilde x$. The intuition is that $\tilde x$ represents the part of
$x^*$ that still has not been allocated to existing demands and future demands can use $\tilde x$ for
their connections. For technical reasons, $\tilde x$ will be indexed by facilities (rather than sites)
and clients, that is $\tilde x = (\tilde x_{\mu j})$. At the beginning, we set $\tilde x_{\mu j} \leftarrow x^*_{ij}$ for each $j \in C$, where
$\mu \in i$ is the single facility created initially at site $i$. At each step, whenever we create a new
demand $\nu$ for a client $j$, we will define its values $\bar x_{\mu\nu}$ and appropriately reduce the values
$\tilde x_{\mu j}$, for all facilities $\mu$. We will deal with two types of neighborhoods, with respect to $\tilde x$ and
$\bar x$, that is $\tilde N(j) = \{\mu \in \overline{F} : \tilde x_{\mu j} > 0\}$ for $j \in C$ and $N(\nu) = \{\mu \in \overline{F} : \bar x_{\mu\nu} > 0\}$ for $\nu \in \overline{C}$.
During this process we preserve the completeness (CO) of the fractional solutions $\tilde x$ and $\bar x$.
More precisely, the following properties will hold for every facility $\mu$ after every iteration:

(c1) For each demand $\nu$ either $\bar x_{\mu\nu} = 0$ or $\bar x_{\mu\nu} = \bar y_\mu$. This is the same condition as condition
(CO), yet we repeat it here as (c1) needs to hold after every iteration, while condition
(CO) only applies to the final partitioned fractional solution $(\bar x, \bar y)$.

(c2) For each client $j$, either $\tilde x_{\mu j} = 0$ or $\tilde x_{\mu j} = \bar y_\mu$.

A full description of the algorithm is given in Pseudocode 1. Initially, the set $U$ of
non-exhausted clients contains all clients, the set $\overline{C}$ of demands is empty, the set $\overline{F}$ of facilities
consists of one facility $\mu$ on each site $i$ with $\bar y_\mu = y^*_i$, and the set $P$ of primary demands
is empty (Lines 1-4). In one iteration of the while loop (Lines 5-8), for each client $j$ we
compute a quantity called $tcc(j)$ (tentative connection cost), that represents the average
distance from $j$ to the set $\tilde N_1(j)$ of the nearest facilities $\mu$ whose total connection value to $j$
(the sum of $\tilde x_{\mu j}$'s) equals 1. This set is computed by Procedure NearestUnitChunk() (see
Pseudocode 2, Lines 1-9), which adds facilities to $\tilde N_1(j)$ in order of nondecreasing distance,
until the total connection value is exactly 1. (The procedure actually uses the $\bar y_\mu$ values,
which are equal to the connection values, by the completeness condition (c2).) This may
require splitting the last added facility and adjusting the connection values so that conditions
(c1) and (c2) are preserved.
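
The chunk computation just described can be sketched as follows. This is a simplified illustration, not the paper's pseudocode verbatim: the function name `nearest_unit_chunk`, the dictionary-based state (`site_of`, `ybar`, `xtil`, `xbar`) and the naming scheme for the new facility are all assumptions. The sample data mirrors the first iteration of the worked example in this section, with clients and sites indexed from 0.

```python
from fractions import Fraction as Fr

def nearest_unit_chunk(j, site_of, ybar, xtil, xbar, dist):
    """Nearest facilities of client j with total value exactly 1.

    Mutates site_of, ybar, xtil and xbar when the last facility must be split.
    """
    nbrs = sorted((m for m in ybar if xtil[j].get(m, 0) > 0),
                  key=lambda m: dist[site_of[m]][j])
    chunk, total = [], 0
    for m in nbrs:
        chunk.append(m)
        total += ybar[m]              # equals x-tilde by completeness (c2)
        if total >= 1:
            break
    else:
        raise ValueError("client has less than one unit of connection value left")
    excess = total - 1
    if excess > 0:                    # split the last facility of the chunk
        last = chunk[-1]
        new = f"{last}.{len(ybar)}"   # hypothetical unique name for the copy
        site_of[new] = site_of[last]
        ybar[new], ybar[last] = excess, ybar[last] - excess
        for conn in list(xtil.values()) + list(xbar.values()):
            if conn.get(last, 0) > 0:  # keep (c1) and (c2): values equal y
                conn[last], conn[new] = ybar[last], ybar[new]
    return chunk

third = Fr(1, 3)
site_of = {"s1": 0, "s2": 1, "s3": 2, "s4": 3}
ybar = {"s1": 4 * third, "s2": third, "s3": third, "s4": third}
dist = [[3 if i == j else 1 for j in range(4)] for i in range(4)]
# Client j still owns its full connection to every site other than its own.
xtil = {j: {m: ybar[m] for m in ybar if site_of[m] != j} for j in range(4)}
xbar = {}                             # no demands have been created yet

chunk = nearest_unit_chunk(1, site_of, ybar, xtil, xbar, dist)
tcc = sum(dist[site_of[m]][1] * xtil[1][m] for m in chunk)
```

With this data the facility at the first site is split into parts of value 1 and 1/3, in line with the example, and the tentative connection cost of the second client is 1.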

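Pseudocode 1 Algorithm: Adaptive Partitioning
Input: $F$, $C$, $(x^*, y^*)$
Output: $\overline{F}$, $\overline{C}$, $(\bar x, \bar y)$   ▷ Unspecified $\bar x_{\mu\nu}$'s and $\tilde x_{\mu j}$'s are assumed to be 0

 1: $\tilde r \leftarrow r$, $U \leftarrow C$, $\overline{F} \leftarrow \emptyset$, $\overline{C} \leftarrow \emptyset$, $P \leftarrow \emptyset$   ▷ Phase 1
 2: for each site $i \in F$ do
 3:   create a facility $\mu$ at $i$ and add $\mu$ to $\overline{F}$
 4:   $\bar y_\mu \leftarrow y^*_i$ and $\tilde x_{\mu j} \leftarrow x^*_{ij}$ for each $j \in C$
 5: while $U \ne \emptyset$ do
 6:   for each $j \in U$ do
 7:     $\tilde N_1(j) \leftarrow$ NearestUnitChunk($j$, $\overline{F}$, $\tilde x$, $\bar x$, $\bar y$)   ▷ see Pseudocode 2
 8:     $tcc(j) \leftarrow \sum_{\mu \in \tilde N_1(j)} d_{\mu j} \cdot \tilde x_{\mu j}$
 9:   $p \leftarrow \arg\min_{j \in U} \{tcc(j) + \alpha^*_j\}$
10:   create a new demand $\nu$ for client $p$
11:   if $\tilde N_1(p) \cap N(\kappa) \ne \emptyset$ for some primary demand $\kappa \in P$ then
12:     assign $\nu$ to $\kappa$
13:     $\bar x_{\mu\nu} \leftarrow \tilde x_{\mu p}$ and $\tilde x_{\mu p} \leftarrow 0$ for each $\mu \in \tilde N(p) \cap N(\kappa)$
14:   else
15:     make $\nu$ primary, $P \leftarrow P \cup \{\nu\}$, assign $\nu$ to itself
16:     set $\bar x_{\mu\nu} \leftarrow \tilde x_{\mu p}$ and $\tilde x_{\mu p} \leftarrow 0$ for each $\mu \in \tilde N_1(p)$
17:   $\overline{C} \leftarrow \overline{C} \cup \{\nu\}$, $\tilde r_p \leftarrow \tilde r_p - 1$
18:   if $\tilde r_p = 0$ then $U \leftarrow U \setminus \{p\}$
19: for each client $j \in C$ do   ▷ Phase 2
20:   for each demand $\nu \in j$ do   ▷ each client $j$ has $r_j$ demands
21:     if $\sum_{\mu \in N(\nu)} \bar x_{\mu\nu} < 1$ then AugmentToUnit($\nu$, $j$, $\overline{F}$, $\tilde x$, $\bar x$, $\bar y$)   ▷ see Pseudocode 2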



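The next step is to pick a client $p$ with minimum $tcc(p) + \alpha^*_p$ and create a demand $\nu$ for $p$
(Lines 9-10). If $\tilde N_1(p)$ overlaps the neighborhood of some existing primary demand $\kappa$ (if there
are multiple such $\kappa$'s, pick any of them), we assign $\nu$ to $\kappa$, and $\nu$ acquires all the connection
values $\tilde x_{\mu p}$ between client $p$ and facilities $\mu$ in $\tilde N(p) \cap N(\kappa)$ (Lines 11-13). Note that although
we check for overlap with $\tilde N_1(p)$, we then move all facilities in the intersection with $\tilde N(p)$, a
bigger set, into $N(\nu)$. The other case is when $\tilde N_1(p)$ is disjoint from the neighborhoods of all
existing primary demands. Then, in Lines 15-16, $\nu$ becomes itself a primary demand and
we assign $\nu$ to itself. It also inherits the connection values to all facilities $\mu \in \tilde N_1(p)$ from $p$
(recall that $\tilde x_{\mu p} = \bar y_\mu$), with all other $\bar x_{\mu\nu}$ values set to 0.

Pseudocode 2 Helper functions used in Pseudocode 1

 1: function NearestUnitChunk($j$, $\overline{F}$, $\tilde x$, $\bar x$, $\bar y$)   ▷ upon return, $\sum_{\mu \in \tilde N_1(j)} \tilde x_{\mu j} = 1$
 2:   Let $\tilde N(j) = \{\mu_1, \ldots, \mu_q\}$ where $d_{\mu_1 j} \le d_{\mu_2 j} \le \ldots \le d_{\mu_q j}$
 3:   Let $l$ be such that $\sum_{k=1}^{l} \bar y_{\mu_k} \ge 1$ and $\sum_{k=1}^{l-1} \bar y_{\mu_k} < 1$
 4:   Create a new facility $\sigma$ at the same site as $\mu_l$ and add it to $\overline{F}$   ▷ split $\mu_l$
 5:   Set $\bar y_\sigma \leftarrow \sum_{k=1}^{l} \bar y_{\mu_k} - 1$ and $\bar y_{\mu_l} \leftarrow \bar y_{\mu_l} - \bar y_\sigma$
 6:   For each $\nu \in \overline{C}$ with $\bar x_{\mu_l \nu} > 0$ set $\bar x_{\mu_l \nu} \leftarrow \bar y_{\mu_l}$ and $\bar x_{\sigma\nu} \leftarrow \bar y_\sigma$
 7:   For each $j' \in C$ with $\tilde x_{\mu_l j'} > 0$ (including $j$) set $\tilde x_{\mu_l j'} \leftarrow \bar y_{\mu_l}$ and $\tilde x_{\sigma j'} \leftarrow \bar y_\sigma$
 8:   (All other new connection values are set to 0)
 9:   return $\tilde N_1(j) = \{\mu_1, \ldots, \mu_{l-1}, \mu_l\}$
10: function AugmentToUnit($\nu$, $j$, $\overline{F}$, $\tilde x$, $\bar x$, $\bar y$)   ▷ $\nu$ is a demand of client $j$
11:   while $\sum_{\mu \in \overline{F}} \bar x_{\mu\nu} < 1$ do   ▷ upon return, $\sum_{\mu \in N(\nu)} \bar x_{\mu\nu} = 1$
12:     Let $\eta$ be any facility such that $\tilde x_{\eta j} > 0$
13:     if $1 - \sum_{\mu \in \overline{F}} \bar x_{\mu\nu} \ge \tilde x_{\eta j}$ then
14:       $\bar x_{\eta\nu} \leftarrow \tilde x_{\eta j}$, $\tilde x_{\eta j} \leftarrow 0$
15:     else
16:       Create a new facility $\sigma$ at the same site as $\eta$ and add it to $\overline{F}$   ▷ split $\eta$
17:       Let $\bar y_\sigma \leftarrow 1 - \sum_{\mu \in \overline{F}} \bar x_{\mu\nu}$, $\bar y_\eta \leftarrow \bar y_\eta - \bar y_\sigma$
18:       Set $\bar x_{\sigma\nu} \leftarrow \bar y_\sigma$, $\bar x_{\eta\nu} \leftarrow 0$, $\tilde x_{\eta j} \leftarrow \bar y_\eta$, $\tilde x_{\sigma j} \leftarrow 0$
19:       For each $\nu' \ne \nu$ with $\bar x_{\eta\nu'} > 0$, set $\bar x_{\eta\nu'} \leftarrow \bar y_\eta$, $\bar x_{\sigma\nu'} \leftarrow \bar y_\sigma$
20:       For each $j' \ne j$ with $\tilde x_{\eta j'} > 0$, set $\tilde x_{\eta j'} \leftarrow \bar y_\eta$, $\tilde x_{\sigma j'} \leftarrow \bar y_\sigma$
21:       (All other new connection values are set to 0)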

At this point all primary demands satisfy Property (PS.1), but this may not be true
for non-primary demands. For those demands we still may need to adjust the $\bar x_{\mu\nu}$ values
so that the total connection value for $\nu$, that is $conn(\nu) = \sum_{\mu \in \overline{F}} \bar x_{\mu\nu}$, is equal to 1. This is
accomplished by Procedure AugmentToUnit() (definition in Pseudocode 2, Lines 10-21)
that allocates to $\nu \in j$ some of the remaining connection values $\tilde x_{\mu j}$ of client $j$ (Lines 19-21
of Pseudocode 1). AugmentToUnit() will repeatedly pick any facility $\eta$ with $\tilde x_{\eta j} > 0$. If
$\tilde x_{\eta j} \le 1 - conn(\nu)$, then the connection value $\tilde x_{\eta j}$ is reassigned to $\nu$. Otherwise,
$\tilde x_{\eta j} > 1 - conn(\nu)$, in which case we split $\eta$ so that connecting $\nu$ to one of the created
copies of $\eta$ will make $conn(\nu)$ equal to 1, and we are done.
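
A minimal sketch of the augmenting step follows, under the same assumptions as a dictionary-based state (`site_of`, `ybar`, `xtil`, `xbar`); the function name `augment_to_unit` and the naming of the split copy are hypothetical. The first call reproduces the augmentation of demand $2''$ from the worked example (labels `"1a"`, `"1b"` stand for the two facilities at site 1), and the second call uses made-up data to exercise the splitting branch.

```python
from fractions import Fraction as Fr

def augment_to_unit(v, j, site_of, ybar, xtil, xbar):
    """Top up demand v of client j from j's unallocated values until it sums to 1."""
    while sum(xbar[v].values()) < 1:
        need = 1 - sum(xbar[v].values())
        eta = next(m for m, val in xtil[j].items() if val > 0)
        if xtil[j][eta] <= need:              # take the whole facility
            xbar[v][eta], xtil[j][eta] = xtil[j][eta], 0
        else:                                  # split eta; v takes the new copy
            sigma = f"{eta}*{len(ybar)}"       # hypothetical unique name
            site_of[sigma] = site_of[eta]
            ybar[sigma], ybar[eta] = need, ybar[eta] - need
            for w, conn in xbar.items():       # other demands keep both copies
                if w != v and conn.get(eta, 0) > 0:
                    conn[eta], conn[sigma] = ybar[eta], ybar[sigma]
            for k, conn in xtil.items():       # and so do other clients
                if k != j and conn.get(eta, 0) > 0:
                    conn[eta], conn[sigma] = ybar[eta], ybar[sigma]
            xbar[v][sigma] = ybar[sigma]
            xtil[j][eta] = ybar[eta]

# Case 1: demand 2'' from the example; the leftover facility fits exactly.
third = Fr(1, 3)
site_of = {"1a": 0, "1b": 0, "3a": 2, "4a": 3}
ybar = {"1a": Fr(1), "1b": third, "3a": third, "4a": third}
xbar = {"2'": {"1a": Fr(1)}, "2''": {"3a": third, "4a": third}}
xtil = {2: {"1b": third}}
augment_to_unit("2''", 2, site_of, ybar, xtil, xbar)

# Case 2 (made-up data): the leftover facility is too large and gets split.
half = Fr(1, 2)
site_of2 = {"a": 0, "b": 1}
ybar2 = {"a": half, "b": Fr(1)}
xbar2 = {"v": {"a": half}}
xtil2 = {"j": {"b": Fr(1)}, "k": {"b": Fr(1)}}
augment_to_unit("v", "j", site_of2, ybar2, xtil2, xbar2)
```

In the second case facility `"b"` is split into two halves: demand `"v"` takes the new copy, client `"j"` keeps the old one for future use, and client `"k"` stays connected to both, so completeness is preserved.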

Notice that we start with $|F|$ facilities and in each iteration of the while loop in Line 5
(Pseudocode 1) each client causes at most one split. We have a total of no more than $R|C|$
iterations as in each iteration we create one demand. (Recall that $R = \max_j r_j$.) In Phase 2
we do an augment step for each demand $\nu$ and this creates no more than $R|C|$ new facilities.
So the total number of facilities we created will be at most $|F| + R|C|^2 + R|C| \le |F| + 2R|C|^2$,
which is polynomial in $|F| + |C|$ due to our earlier bound on $R$.

Example. We now illustrate our partitioning algorithm with an example, where the FTFP
instance has four sites and four clients. The demands are $r_1 = 1$ and $r_2 = r_3 = r_4 = 2$. The
facility costs are $f_i = 1$ for all $i$. The distances are defined as follows: $d_{ii} = 3$ for $i = 1, 2, 3, 4$
and $d_{ij} = 1$ for all $i \ne j$. Solving the LP (1), we obtain the fractional solution given in
Table 1a.

[Table 1 (contents not recovered): An example of an execution of the partitioning algorithm. (a) An optimal fractional
solution $x^*, y^*$. (b) The partitioned solution. $j'$ and $j''$ denote the first and second demand
of a client $j$, and $\dot\imath$ and $\ddot\imath$ denote the first and second facility at site $i$.]

It is easily seen that the fractional solution in Table 1a is optimal and complete ($x^*_{ij} > 0$
implies $x^*_{ij} = y^*_i$). The dual optimal solution has all $\alpha^*_j = 4/3$ for $j = 1, 2, 3, 4$.

Now we perform Phase 1, the adaptive partitioning, following the description in
Pseudocode 1. To streamline the presentation, we assume that all ties are broken in favor of
lower-numbered clients, demands or facilities. First we create one facility at each of the four
sites, denoted as $\dot 1$, $\dot 2$, $\dot 3$ and $\dot 4$ (Lines 2-4, Pseudocode 1). We then execute the "while" loop
in Line 5 of Pseudocode 1. This loop will have seven iterations. Consider the first iteration.
In Lines 7-8 we compute $tcc(j)$ for each client $j = 1, 2, 3, 4$ in $U$. When computing $\tilde N_1(2)$,
facility $\dot 1$ will get split into $\dot 1$ and $\ddot 1$ with $\bar y_{\dot 1} = 1$ and $\bar y_{\ddot 1} = 1/3$. (This will happen in Lines 4-7
of Pseudocode 2.) Then, in Line 9 we will pick client $p = 1$ and create a demand denoted as
$1'$ (see Table 1b). Since there are no primary demands yet, we make $1'$ a primary demand
with $N(1') = \tilde N_1(1) = \{\dot 2, \dot 3, \dot 4\}$. Notice that client 1 is exhausted after this iteration and $U$
becomes $\{2, 3, 4\}$.

In the second iteration we compute $tcc(j)$ for $j = 2, 3, 4$ and pick client $p = 2$, from which
we create a new demand $2'$. We have $\tilde N_1(2) = \{\dot 1\}$, which is disjoint from $N(1')$. So we create
a demand $2'$ and make it primary, and set $N(2') = \{\dot 1\}$. In the third iteration we compute
$tcc(j)$ for $j = 2, 3, 4$ and again we pick client $p = 2$. Since $\tilde N_1(2) = \{\ddot 1, \dot 3, \dot 4\}$ overlaps with
$N(1')$, we create a demand $2''$ and assign it to $1'$. We also set $N(2'') = N(1') \cap \tilde N(2) = \{\dot 3, \dot 4\}$.
After this iteration client 2 is exhausted and we have $U = \{3, 4\}$.

In the fourth iteration we compute $tcc(j)$ for clients $j = 3, 4$. We pick $p = 3$ and create
demand $3'$. Since $\tilde N_1(3) = \{\dot 1\}$ overlaps $N(2')$, we assign $3'$ to $2'$ and set $N(3') = \{\dot 1\}$. In
the fifth iteration we compute $tcc(j)$ for clients $j = 3, 4$ and pick $p = 3$ again. At this time
$\tilde N_1(3) = \{\ddot 1, \dot 2, \dot 4\}$, which overlaps with $N(1')$. So we create a demand $3''$ and assign it to $1'$,
as well as set $N(3'') = \{\dot 2, \dot 4\}$.

In the last two iterations we will pick client $p = 4$ twice and create demands $4'$ and $4''$.
For $4'$ we have $\tilde N_1(4) = \{\dot 1\}$ so we assign $4'$ to $2'$ and set $N(4') = \{\dot 1\}$. For $4''$ we have
$\tilde N_1(4) = \{\ddot 1, \dot 2, \dot 3\}$ and we assign it to $1'$, as well as set $N(4'') = \{\dot 2, \dot 3\}$.

Now that all clients are exhausted we perform Phase 2, the augmenting phase, to construct
a fractional solution in which all demands have total connection value equal to 1. We
iterate through each of the seven demands created, that is 1', 2', 2'', 3', 3'', 4', 4''. Demands
1' and 2' already have neighborhoods with total connection value 1, so nothing changes in the
first two iterations. Demand 2'' has 3, 4 in its neighborhood, with total connection value 2/3,
and $\widetilde{N}(2) = \{1\}$ at this time, so we add 1 into $\overline{N}(2'')$ to make
$\overline{N}(2'') = \{1,3,4\}$, and now 2'' has total connection value 1. Similarly, 3'' and
4'' each get 1 added to their neighborhood and end up with total connection value 1. The other
two demands, namely 3' and 4', each have 1 in their neighborhood, so each of them already has
total connection value equal to 1. This completes Phase 2.



The final partitioned fractional solution is given in Table 1b. We have created a total of
five facilities 1, 1, 2, 3, 4, and seven demands, 1', 2', 2'', 3', 3'', 4', 4''. It can be verified
that all the stated properties are satisfied.

Correctness. We now show that all the required properties (PS), (CO), (PD) and (SI) are 
satisfied by the above construction. 

Properties (PS) and (CO) follow directly from the algorithm. (CO) is implied by the
completeness condition (c1) that the algorithm maintains after each iteration. Condition (PS.1)
is a result of calling Procedure AugmentToUnit() in Line 21. To see that (PS.2) holds,
note that at each step the algorithm maintains the invariant that, for every $i \in \mathbb{F}$
and $j \in \mathbb{C}$, we have
$\sum_{\mu\in i}\sum_{\nu\in j} \bar{x}_{\mu\nu} + \sum_{\mu\in i} \widetilde{x}_{\mu j} = x^*_{ij}$.
We create $r_j$ demands for each client j, with each demand $\nu \in j$ satisfying (PS.1), and thus
$\sum_{\nu\in j}\sum_{\mu\in\overline{\mathbb{F}}} \bar{x}_{\mu\nu} = r_j$. This
implies that $\widetilde{x}_{\mu j} = 0$ for every facility $\mu \in \overline{\mathbb{F}}$, and
(PS.2) follows. (PS.3) holds because every time we split a facility $\mu$ into $\mu'$ and $\mu''$,
the sum of $\bar{y}_{\mu'}$ and $\bar{y}_{\mu''}$ is equal to the old value of $\bar{y}_{\mu}$.

Now we deal with properties in group (PD). First, (PD.1) follows directly from the
algorithm, Pseudocode 1 (Lines 14-16), since every primary demand has its neighborhood
fixed when created, and that neighborhood is disjoint from those of the existing primary
demands.

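Property (PD.2) follows from (PD.1), (CO) and (PS.3). In more detail, it can be justified
as follows. By (PD.1), for each $\mu \in i$ there is at most one $\kappa \in P$ with
$\bar{x}_{\mu\kappa} > 0$, and we have $\bar{x}_{\mu\kappa} = \bar{y}_{\mu}$ due to (CO). Let
$K \subseteq i$ be the set of those $\mu$'s for which such $\kappa \in P$ exists, and
denote this $\kappa$ by $\kappa_\mu$. Then, using conditions (CO) and (PS.3), we have

$$\sum_{\mu\in i}\sum_{\kappa\in P} \bar{x}_{\mu\kappa} = \sum_{\mu\in K} \bar{x}_{\mu\kappa_\mu} = \sum_{\mu\in K} \bar{y}_{\mu} \le \sum_{\mu\in i} \bar{y}_{\mu} = y^*_i.$$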



Property (PD.3(a)) follows from the way the algorithm assigns primary demands. When
demand $\nu$ of client p is assigned to a primary demand $\kappa$ in Lines 11-13 of Pseudocode 1, we
move all facilities in $\widetilde{N}(p) \cap \overline{N}(\kappa)$ (the intersection is nonempty)
into $\overline{N}(\nu)$, and we never remove a facility from $\overline{N}(\nu)$. We postpone
the proof for (PD.3(b)) to Lemma 5.



Finally we argue that the properties in group (SI) hold. (SI.1) is easy, since for any client
j, each facility $\mu$ is added to the neighborhood of at most one demand $\nu \in j$, by setting
$\bar{x}_{\mu\nu}$ to $\bar{y}_{\mu}$, while other siblings $\nu'$ of $\nu$ have
$\bar{x}_{\mu\nu'} = 0$. Note that right after a demand $\nu \in p$ is created,
its neighborhood is disjoint from the neighborhood of p, that is
$\overline{N}(\nu) \cap \widetilde{N}(p) = \emptyset$, by Lines 11-13 of the algorithm. Thus all
demands of p created later will have neighborhoods disjoint from the set $\overline{N}(\nu)$
before the augmenting Phase 2. Furthermore, Procedure AugmentToUnit() preserves this property,
because when it adds a facility to $\overline{N}(\nu)$ then it removes it from
$\widetilde{N}(p)$, and in case of splitting, one resulting facility is added to
$\overline{N}(\nu)$ and the other to $\widetilde{N}(p)$.
Property (SI.2) is shown below in Lemma 3.



It remains to show Properties (PD.3(b)) and (SI.2). We show them in the lemmas below,
thus completing the description of our adaptive partition process.

Lemma 3. Property (SI.2) holds after the Adaptive Partitioning stage.



Proof. Let $\nu_1, \ldots, \nu_{r_j}$ be the demands of a client $j \in \mathbb{C}$, listed in the
order of creation, and, for each $q = 1, 2, \ldots, r_j$, denote by $\kappa_q$ the primary demand
that $\nu_q$ is assigned to. After the completion of Phase 1 of Pseudocode 1 (Lines 5-18), we have
$\overline{N}(\nu_s) \subseteq \overline{N}(\kappa_s)$ for $s = 1, \ldots, r_j$.
Since any two primary demands have disjoint neighborhoods, we have
$\overline{N}(\nu_s) \cap \overline{N}(\kappa_q) = \emptyset$
for any $s \ne q$, that is, Property (SI.2) holds right after Phase 1.

After Phase 1 all neighborhoods $\overline{N}(\kappa_s)$, $s = 1, \ldots, r_j$, have already been
fixed and they do not change in Phase 2. None of the facilities in $\widetilde{N}(j)$ appear in
any of $\overline{N}(\kappa_s)$ for $s = 1, \ldots, r_j$, by the way we allocate facilities in
Lines 13 and 16. Therefore during the augmentation process in Phase 2, when we add facilities
from $\widetilde{N}(j)$ to $\overline{N}(\nu)$, for some $\nu \in j$ (Lines 19-21
of Pseudocode 1), all the required disjointness conditions will be preserved. □



We need one more lemma before proving our last property (PD.3(b)). For a client j and
a demand $\nu$, we use notation $\mathrm{tcc}^{\nu}(j)$ for the value of tcc(j) at the time when
$\nu$ was created. (It is not necessary that $\nu \in j$, but we assume that j is not exhausted at
that time.)

Lemma 4. Let $\eta$ and $\nu$ be two demands, with $\eta$ created no later than $\nu$, and let
$j \in \mathbb{C}$ be a client that is not exhausted when $\nu$ is created. Then we have

(a) $\mathrm{tcc}^{\eta}(j) \le \mathrm{tcc}^{\nu}(j)$, and

(b) if $\nu \in j$ then $\mathrm{tcc}^{\eta}(j) \le C^{\mathrm{avg}}_{\nu}$.

Proof. We focus first on the time when demand $\eta$ is about to be created, right after the
call to NearestUnitChunk() in Pseudocode 1, Line 7. Let
$\widetilde{N}(j) = \{\mu_1, \ldots, \mu_q\}$ with all facilities $\mu_s$ ordered according to
nondecreasing distance from j. Consider the following linear program:

$$\text{minimize } \sum_{s} d_{\mu_s j}\, z_s$$

$$\text{subject to } \sum_{s} z_s \ge 1, \qquad 0 \le z_s \le \widetilde{x}_{\mu_s j} \text{ for all } s.$$

This is a fractional minimum knapsack covering problem (with knapsack size equal to 1) and
its optimal fractional solution is the greedy solution, whose value is exactly $\mathrm{tcc}^{\eta}(j)$.
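The greedy rule for this covering problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `greedy_unit_cover` and the list-based input (one distance and one upper bound per facility) are hypothetical and are not part of the paper's pseudocode.

```python
def greedy_unit_cover(distances, caps):
    """Minimise sum(d_s * z_s) subject to sum(z_s) >= 1 and 0 <= z_s <= caps[s].

    Hypothetical helper: take the nearest facilities first, using a
    fraction of the last one so that the total taken is exactly 1.
    """
    remaining = 1.0
    cost = 0.0
    for d, cap in sorted(zip(distances, caps)):
        take = min(cap, remaining)   # a partial take covers the last bit
        cost += d * take
        remaining -= take
        if remaining <= 1e-12:
            return cost
    raise ValueError("upper bounds sum to less than 1; the LP is infeasible")
```

Any other feasible choice of the variables moves some weight from a nearer facility to a farther one, so it cannot cost less than the value returned here.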

On the other hand, we claim that $\mathrm{tcc}^{\nu}(j)$ can be thought of as the value of some
feasible solution to this linear program, and that the same is true for $C^{\mathrm{avg}}_{\nu}$
if $\nu \in j$. Indeed, each of these quantities involves some later values
$\widetilde{x}_{\mu j}$, where $\mu$ could be one of the facilities $\mu_s$ or
a new facility obtained from splitting. For each s, however, the sum of all values
$\widetilde{x}_{\mu j}$, over the facilities $\mu$ that were split from $\mu_s$, cannot exceed the
value $\widetilde{x}_{\mu_s j}$ at the time when $\eta$ was created, because splitting facilities
preserves this sum and creating new demands for j can only decrease it. Therefore both quantities
$\mathrm{tcc}^{\nu}(j)$ and $C^{\mathrm{avg}}_{\nu}$ (for $\nu \in j$) correspond to some
choice of the $z_s$ variables (adding up to 1), and the lemma follows. □



Lemma 5. Property (PD.3(b)) holds after the Adaptive Partitioning stage.

Proof. Suppose that demand $\nu \in j$ is assigned to some primary demand $\kappa \in p$. Then

$$C^{\mathrm{avg}}_{\kappa} + \alpha^*_{\kappa} = \mathrm{tcc}^{\kappa}(p) + \alpha^*_{p} \le \mathrm{tcc}^{\kappa}(j) + \alpha^*_{j} \le C^{\mathrm{avg}}_{\nu} + \alpha^*_{\nu}.$$

We now justify this derivation. By definition we have $\alpha^*_{\kappa} = \alpha^*_{p}$. Further,
by the algorithm, if $\kappa$ is a primary demand of client p, then $C^{\mathrm{avg}}_{\kappa}$ is
equal to tcc(p) computed when $\kappa$ is created, which is exactly $\mathrm{tcc}^{\kappa}(p)$.
Thus the first equation is true. The first inequality follows from the choice of p in Line 9 in
Pseudocode 1. The last inequality holds because $\alpha^*_{j} = \alpha^*_{\nu}$ (due
to $\nu \in j$), and because $\mathrm{tcc}^{\kappa}(j) \le C^{\mathrm{avg}}_{\nu}$, which follows
from Lemma 4. □



We have thus proved that all properties (PS), (CO), (PD) and (SI) hold for our partitioned
fractional solution $(\bar{x}, \bar{y})$. In the following sections we show how to use these
properties to round the fractional solution to an approximate integral solution. For the
3-approximation algorithm (Section 5) and the 1.736-approximation algorithm (Section 6), the
first phase of the algorithm is exactly the same partition process as described above. However,
the 1.575-approximation algorithm (Section 7) demands a more sophisticated partitioning process,
as the interplay between the close and far neighborhoods of sibling demands results in more
delicate properties that our partitioned fractional solution must satisfy.



5. Algorithm EGUP with Ratio 3 

With the partitioned FTFP instance and its associated fractional solution in place, we 
now begin to introduce our rounding algorithms. The algorithm we describe in this section 
achieves ratio 3. Although this is still quite far from our best ratio 1.575 that we derive 
later, we include this algorithm in the paper to illustrate, in a relatively simple setting, how 
the properties of our partitioned fractional solution are used in rounding it to an integral 
solution with cost not too far away from an optimal solution. The rounding approach we 
use here is an extension of the corresponding method for UFL described in [8].

Algorithm EGUP. At a high level, we would open exactly one facility for each primary
demand $\kappa$, and each non-primary demand is connected to the facility opened for the primary
demand it was assigned to.

More precisely, we apply a rounding process, guided by the fractional values $(\bar{y}_{\mu})$ and
$(\bar{x}_{\mu\nu})$, that produces an integral solution. This integral solution is obtained by
choosing a subset of facilities in $\overline{\mathbb{F}}$ to open, and for each demand in
$\overline{\mathbb{C}}$, specifying an open facility that this demand will be connected to. For
each primary demand $\kappa \in P$, we want to open one facility
$\phi(\kappa) \in \overline{N}(\kappa)$. To this end, we use randomization: for each
$\mu \in \overline{N}(\kappa)$, we choose $\phi(\kappa) = \mu$ with probability
$\bar{x}_{\mu\kappa}$, ensuring that exactly one $\mu \in \overline{N}(\kappa)$ is chosen. Note that
$\sum_{\mu\in\overline{N}(\kappa)} \bar{x}_{\mu\kappa} = 1$, so this distribution is well-defined.
We open this facility $\phi(\kappa)$ and connect to $\phi(\kappa)$ all demands that are assigned
to $\kappa$.
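The rounding step just described can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `egup_round` and the dictionary-based representation of neighborhoods and assignments are assumptions made for this example, and the partitioned fractional solution is taken as given.

```python
import random

def egup_round(primary_nbhd, assigned_to, rng=random):
    """Hypothetical sketch of the EGUP rounding step.

    primary_nbhd: {primary demand: {facility: connection value}}, where the
                  values of each primary demand sum to 1.
    assigned_to:  {demand: primary demand}; a primary demand maps to itself.
    Returns (set of opened facilities, {demand: facility it connects to}).
    """
    chosen = {}
    for kappa, nbhd in primary_nbhd.items():
        facilities = list(nbhd)
        weights = [nbhd[mu] for mu in facilities]
        # Exactly one facility per primary demand, picked with
        # probability equal to its connection value.
        chosen[kappa] = rng.choices(facilities, weights=weights, k=1)[0]
    # Every demand uses the facility opened for its assigned primary demand.
    connection = {nu: chosen[kappa] for nu, kappa in assigned_to.items()}
    return set(chosen.values()), connection
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing against a derandomized variant.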

In our description above, the algorithm is presented as a randomized algorithm. It can 
be de-randomized using the method of conditional expectations, which is commonly used in 
approximation algorithms for facility location problems and standard enough that presenting 
it here would be redundant. Readers less familiar with this field are recommended to consult 
[3] , where the method of conditional expectations is applied in a context very similar to ours. 

Analysis. We now bound the expected facility cost and connection cost by establishing the 
two lemmas below. 

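Lemma 6. The expectation of facility cost $F_{\mathrm{EGUP}}$ of our solution is at most $F^*$.

Proof. By Property (PD.1), the neighborhoods of primary demands are disjoint. Also, for
any primary demand $\kappa \in P$, the probability that a facility $\mu \in \overline{N}(\kappa)$
is chosen as the open facility $\phi(\kappa)$ is $\bar{x}_{\mu\kappa}$. Hence the expected total
facility cost is

$$E[F_{\mathrm{EGUP}}] = \sum_{\kappa\in P}\sum_{\mu\in\overline{N}(\kappa)} f_{\mu}\,\bar{x}_{\mu\kappa} = \sum_{i\in\mathbb{F}} f_i \sum_{\kappa\in P}\sum_{\mu\in i} \bar{x}_{\mu\kappa} \le \sum_{i\in\mathbb{F}} f_i\, y^*_i = F^*,$$

where the inequality follows from Property (PD.2). □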

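Lemma 7. The expectation of connection cost $C_{\mathrm{EGUP}}$ of our solution is at most
$C^* + 2\cdot LP^*$.

Proof. For a primary demand $\kappa$, its expected connection cost is $C^{\mathrm{avg}}_{\kappa}$
because we choose facility $\mu$ with probability $\bar{x}_{\mu\kappa}$.

Consider a non-primary demand $\nu$ assigned to a primary demand $\kappa \in P$. Let $\mu$ be any
facility in $\overline{N}(\nu) \cap \overline{N}(\kappa)$. Since $\mu$ is in both
$\overline{N}(\nu)$ and $\overline{N}(\kappa)$, we have $d_{\mu\nu} \le \alpha^*_{\nu}$ and
$d_{\mu\kappa} \le \alpha^*_{\kappa}$. (This follows from the complementary slackness conditions,
since $\alpha^*_{\nu} = \beta^*_{\mu\nu} + d_{\mu\nu}$ for each $\mu \in \overline{N}(\nu)$.) Thus,
applying the triangle inequality, for any fixed choice of facility $\phi(\kappa)$ we have

$$d_{\phi(\kappa)\nu} \le d_{\phi(\kappa)\kappa} + d_{\mu\kappa} + d_{\mu\nu} \le d_{\phi(\kappa)\kappa} + \alpha^*_{\kappa} + \alpha^*_{\nu}.$$

Therefore the expected distance from $\nu$ to its facility $\phi(\kappa)$ is

$$E[d_{\phi(\kappa)\nu}] \le C^{\mathrm{avg}}_{\kappa} + \alpha^*_{\kappa} + \alpha^*_{\nu} \le C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu},$$

where the second inequality follows from Property (PD.3(b)). From the definition of
$C^{\mathrm{avg}}_{\nu}$ and Property (PS.2), for any $j \in \mathbb{C}$ we have

$$\sum_{\nu\in j} C^{\mathrm{avg}}_{\nu} = \sum_{\nu\in j}\sum_{\mu\in\overline{\mathbb{F}}} d_{\mu\nu}\,\bar{x}_{\mu\nu} = \sum_{i\in\mathbb{F}} d_{ij}\, x^*_{ij} = C^*_j.$$

Thus, summing over all demands, the expected total connection cost is

$$E[C_{\mathrm{EGUP}}] \le \sum_{j\in\mathbb{C}}\sum_{\nu\in j} \left(C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu}\right) = \sum_{j\in\mathbb{C}} \left(C^*_j + 2 r_j \alpha^*_j\right) = C^* + 2\cdot LP^*,$$

completing the proof of the lemma. □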
Theorem 8. Algorithm EGUP is a 3-approximation algorithm.

Proof. By Property (SI.2), different demands from the same client are assigned to different
primary demands, and by (PD.1) each primary demand opens a different facility. This
ensures that our solution is feasible, namely each client j is connected to $r_j$ different
facilities (some possibly located on the same site). As for the total cost, Lemma 6 and Lemma 7
imply that the total cost is at most $F^* + C^* + 2\cdot LP^* = 3\cdot LP^* \le 3\cdot OPT$. □



6. Algorithm ECHS with Ratio 1.736 

In this section we improve the approximation ratio to $1 + 2/e \approx 1.736$. The improvement
comes from a slightly modified rounding process and refined analysis. Note that the facility 
opening cost of Algorithm EGUP does not exceed that of the fractional optimum solution, 
while the connection cost could be far from the optimum, since we connect a non-primary 
demand to a facility in the neighborhood of its assigned primary demand and then estimate 
the distance using the triangle inequality. The basic idea to improve the estimate of the 
connection cost, following the approach of Chudak and Shmoys [3], is to connect each non- 
primary demand to its nearest neighbor when one is available, and to only use the facility 
opened by its assigned primary demand when none of its neighbors is open. 

Algorithm ECHS. As before, the algorithm starts by solving the linear program and
applying the adaptive partitioning algorithm described in Section 4 to obtain a partitioned
solution $(\bar{x}, \bar{y})$. Then we apply the rounding process to compute an integral solution
(see Pseudocode 3).

We start, as before, by opening exactly one facility $\phi(\kappa)$ in the neighborhood of each
primary demand $\kappa$ (Line 2). For any non-primary demand $\nu$ assigned to $\kappa$, we refer to
$\phi(\kappa)$ as the target facility of $\nu$. In Algorithm EGUP, $\nu$ was connected to
$\phi(\kappa)$, but in Algorithm ECHS we may be able to find an open facility in $\nu$'s
neighborhood and connect $\nu$ to this facility. Specifically, the two changes in the algorithm
are as follows:

(1) Each facility $\mu$ that is not in the neighborhood of any primary demand is opened,
independently, with probability $\bar{y}_{\mu}$ (Lines 4-5). Notice that if $\bar{y}_{\mu} > 0$
then, due to completeness of the partitioned fractional solution, we have
$\bar{y}_{\mu} = \bar{x}_{\mu\nu}$ for some demand $\nu$. This implies that $\bar{y}_{\mu} \le 1$,
because $\bar{x}_{\mu\nu} \le 1$, by (PS.1).

(2) When connecting demands to facilities, a primary demand $\kappa$ is connected to the only
facility $\phi(\kappa)$ opened in its neighborhood, as before (Line 3). For a non-primary demand
$\nu$, if its neighborhood $\overline{N}(\nu)$ has an open facility, we connect $\nu$ to the
closest open facility in $\overline{N}(\nu)$ (Line 8). Otherwise, we connect $\nu$ to its target
facility $\phi(\kappa)$ (Line 10).



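Pseudocode 3 Algorithm ECHS: Constructing Integral Solution
 1: for each $\kappa \in P$ do
 2:     choose one $\phi(\kappa) \in \overline{N}(\kappa)$, with each $\mu \in \overline{N}(\kappa)$ chosen as $\phi(\kappa)$ with probability $\bar{y}_{\mu}$
 3:     open $\phi(\kappa)$ and connect $\kappa$ to $\phi(\kappa)$
 4: for each $\mu \in \overline{\mathbb{F}} - \bigcup_{\kappa\in P} \overline{N}(\kappa)$ do
 5:     open $\mu$ with probability $\bar{y}_{\mu}$ (independently)
 6: for each non-primary demand $\nu \in \overline{\mathbb{C}}$ do
 7:     if any facility in $\overline{N}(\nu)$ is open then
 8:         connect $\nu$ to the nearest open facility in $\overline{N}(\nu)$
 9:     else
10:         connect $\nu$ to $\phi(\kappa)$ where $\kappa$ is $\nu$'s assigned primary demand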
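For readers who prefer executable code, the pseudocode above can be transcribed roughly as follows. This is a sketch under assumptions: the name `echs_round`, the dictionary-based inputs, and the explicit distance table are illustrative choices, and the partitioned fractional solution is assumed to be available already.

```python
import random

def echs_round(primary_nbhd, y, nbhd, dist, assigned_to, rng=random):
    """Hypothetical transcription of Pseudocode 3.

    primary_nbhd: {primary demand: {facility: opening value}}
    y:            {facility: opening value} for every facility
    nbhd:         {demand: set of facilities in its neighborhood}
    dist:         {(facility, demand): distance}
    assigned_to:  {demand: primary demand}; a primary demand maps to itself
    """
    opened, target = set(), {}
    # Lines 1-3: one facility per primary demand.
    for kappa, probs in primary_nbhd.items():
        mus = list(probs)
        phi = rng.choices(mus, weights=[probs[m] for m in mus], k=1)[0]
        target[kappa] = phi
        opened.add(phi)
    # Lines 4-5: remaining facilities open independently.
    in_primary = set().union(*primary_nbhd.values())
    for mu, y_mu in y.items():
        if mu not in in_primary and rng.random() < y_mu:
            opened.add(mu)
    # Lines 6-10: connect demands.
    connection = {}
    for nu, kappa in assigned_to.items():
        open_nbrs = nbhd[nu] & opened
        if nu == kappa or not open_nbrs:
            connection[nu] = target[kappa]          # Line 3 or Line 10
        else:
            connection[nu] = min(open_nbrs, key=lambda mu: dist[(mu, nu)])
    return opened, connection
```

The only difference from a sketch of Algorithm EGUP is the second loop and the preference for an open neighbor over the target facility.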



Analysis. We shall first argue that the integral solution thus constructed is feasible, and
then we bound the total cost of the solution. Regarding feasibility, the only constraint that
is not explicitly enforced by the algorithm is the fault-tolerance requirement, namely that
each client j is connected to $r_j$ different facilities. Let $\nu$ and $\nu'$ be two different
sibling demands of client j and let their assigned primary demands be $\kappa$ and $\kappa'$,
respectively. Due to (SI.2) we know $\kappa \ne \kappa'$. From (SI.1) we have
$\overline{N}(\nu) \cap \overline{N}(\nu') = \emptyset$. From (SI.2), we have
$\overline{N}(\nu) \cap \overline{N}(\kappa') = \emptyset$ and
$\overline{N}(\nu') \cap \overline{N}(\kappa) = \emptyset$. From (PD.1) we have
$\overline{N}(\kappa) \cap \overline{N}(\kappa') = \emptyset$. It follows
that $(\overline{N}(\nu) \cup \overline{N}(\kappa)) \cap (\overline{N}(\nu') \cup \overline{N}(\kappa')) = \emptyset$.
Since the algorithm connects $\nu$ to some facility in
$\overline{N}(\nu) \cup \overline{N}(\kappa)$ and $\nu'$ to some facility in
$\overline{N}(\nu') \cup \overline{N}(\kappa')$, $\nu$ and $\nu'$ will be connected to different
facilities.

We now show that the expected cost of the computed solution is bounded by $(1 + 2/e)\cdot LP^*$.
By (PD.1), every facility may appear in at most one primary demand's neighborhood, and
the facilities opened in Lines 4-5 of Pseudocode 3 do not appear in any primary demand's
neighborhood. Therefore, by linearity of expectation, the expected facility cost of
Algorithm ECHS is

$$E[F_{\mathrm{ECHS}}] = \sum_{\mu\in\overline{\mathbb{F}}} f_{\mu}\,\bar{y}_{\mu} = \sum_{i\in\mathbb{F}} f_i \sum_{\mu\in i} \bar{y}_{\mu} = \sum_{i\in\mathbb{F}} f_i\, y^*_i = F^*,$$

where the third equality follows from (PS.3).

To bound the connection cost, we adapt an argument of Chudak and Shmoys [5]. Consider
a demand $\nu$ and denote by $C_{\nu}$ the random variable representing the connection cost for
$\nu$. Our goal now is to estimate $E[C_{\nu}]$, the expected value of $C_{\nu}$. Demand $\nu$ can
either get connected directly to some facility in $\overline{N}(\nu)$ or indirectly to its target
facility $\phi(\kappa) \in \overline{N}(\kappa)$, where $\kappa$ is the primary demand to which
$\nu$ is assigned. We will analyze these two cases separately.

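In our analysis, in this section and the next one, we will use notation

$$D(A, \nu) = \frac{\sum_{\mu\in A} d_{\mu\nu}\,\bar{y}_{\mu}}{\sum_{\mu\in A} \bar{y}_{\mu}}$$

for the average distance between a demand $\nu$ and a set A of facilities. Note that, in
particular, we have $C^{\mathrm{avg}}_{\nu} = D(\overline{N}(\nu), \nu)$.

We first estimate the expected cost $d_{\phi(\kappa)\nu}$ of the indirect connection. Let
$\Lambda^{\nu}$ denote the event that some facility in $\overline{N}(\nu)$ is opened. Then

$$E[C_{\nu} \mid \neg\Lambda^{\nu}] = E[d_{\phi(\kappa)\nu} \mid \neg\Lambda^{\nu}] = D(\overline{N}(\kappa) \setminus \overline{N}(\nu), \nu). \qquad (3)$$

Note that $\neg\Lambda^{\nu}$ implies that $\overline{N}(\kappa) \setminus \overline{N}(\nu) \ne \emptyset$,
since $\overline{N}(\kappa)$ contains exactly one open facility, namely $\phi(\kappa)$.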

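Lemma 9. Let $\nu$ be a demand assigned to a primary demand $\kappa$, and assume that
$\overline{N}(\kappa) \setminus \overline{N}(\nu) \ne \emptyset$. Then

$$E[C_{\nu} \mid \neg\Lambda^{\nu}] \le C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu}.$$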



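Proof. By (3), we need to show that
$D(\overline{N}(\kappa) \setminus \overline{N}(\nu), \nu) \le C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu}$.
There are two cases to consider.

Case 1 : There exists some $\mu' \in \overline{N}(\kappa) \cap \overline{N}(\nu)$ such that
$d_{\mu'\kappa} \le C^{\mathrm{avg}}_{\kappa}$. In this case, for every
$\mu \in \overline{N}(\kappa) \setminus \overline{N}(\nu)$, we have

$$d_{\mu\nu} \le d_{\mu\kappa} + d_{\mu'\kappa} + d_{\mu'\nu} \le \alpha^*_{\kappa} + C^{\mathrm{avg}}_{\kappa} + \alpha^*_{\nu} \le C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu},$$

using the triangle inequality, complementary slackness, and (PD.3(b)). By summing
over all $\mu \in \overline{N}(\kappa) \setminus \overline{N}(\nu)$, it follows that
$D(\overline{N}(\kappa) \setminus \overline{N}(\nu), \nu) \le C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu}$.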

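Case 2 : Every $\mu' \in \overline{N}(\kappa) \cap \overline{N}(\nu)$ has
$d_{\mu'\kappa} > C^{\mathrm{avg}}_{\kappa}$. Since
$C^{\mathrm{avg}}_{\kappa} = D(\overline{N}(\kappa), \kappa)$, this implies
that $D(\overline{N}(\kappa) \setminus \overline{N}(\nu), \kappa) \le C^{\mathrm{avg}}_{\kappa}$.
Therefore, choosing an arbitrary $\mu' \in \overline{N}(\kappa) \cap \overline{N}(\nu)$,
we obtain

$$D(\overline{N}(\kappa) \setminus \overline{N}(\nu), \nu) \le D(\overline{N}(\kappa) \setminus \overline{N}(\nu), \kappa) + d_{\mu'\kappa} + d_{\mu'\nu} \le C^{\mathrm{avg}}_{\kappa} + \alpha^*_{\kappa} + \alpha^*_{\nu} \le C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu},$$

where we again use the triangle inequality, complementary slackness, and (PD.3(b)).

Since the lemma holds in both cases, the proof is now complete. □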



We now continue our estimation of the connection cost. The next step of our analysis is
to show that

$$E[C_{\nu}] \le C^{\mathrm{avg}}_{\nu} + \frac{2}{e}\,\alpha^*_{\nu}. \qquad (4)$$

The argument is divided into three cases. The first, easy case is when $\nu$ is a primary demand
$\kappa$. According to the algorithm (see Pseudocode 3, Line 2), we have
$C_{\kappa} = d_{\mu\kappa}$ with probability $\bar{y}_{\mu}$, for $\mu \in \overline{N}(\kappa)$.
Therefore $E[C_{\kappa}] = C^{\mathrm{avg}}_{\kappa}$, so (4) holds.

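Next, we consider a non-primary demand $\nu$. Let $\kappa$ be the primary demand that $\nu$ is
assigned to. We first deal with the sub-case when
$\overline{N}(\kappa) \setminus \overline{N}(\nu) = \emptyset$, which is the same
as $\overline{N}(\kappa) \subseteq \overline{N}(\nu)$. Property (CO) implies that
$\bar{x}_{\mu\nu} = \bar{y}_{\mu} = \bar{x}_{\mu\kappa}$ for every $\mu \in \overline{N}(\kappa)$,
so we have
$\sum_{\mu\in\overline{N}(\kappa)} \bar{x}_{\mu\nu} = \sum_{\mu\in\overline{N}(\kappa)} \bar{x}_{\mu\kappa} = 1$,
due to (PS.1). On the other hand, we have
$\sum_{\mu\in\overline{N}(\nu)} \bar{x}_{\mu\nu} = 1$, and $\bar{x}_{\mu\nu} > 0$ for all
$\mu \in \overline{N}(\nu)$. Therefore $\overline{N}(\kappa) = \overline{N}(\nu)$ and $C_{\nu}$ has
exactly the same distribution as $C_{\kappa}$. So this case reduces to the first case, namely we
have $E[C_{\nu}] = C^{\mathrm{avg}}_{\nu}$, and (4) holds.

The last, and only non-trivial, case is when
$\overline{N}(\kappa) \setminus \overline{N}(\nu) \ne \emptyset$. We handle this case in the
following lemma.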

Lemma 10. Assume that $\overline{N}(\kappa) \setminus \overline{N}(\nu) \ne \emptyset$. Then the
expected connection cost of $\nu$, conditioned on the event that at least one of its neighbors
opens, satisfies

$$E[C_{\nu} \mid \Lambda^{\nu}] \le C^{\mathrm{avg}}_{\nu}.$$

Proof. The proof is similar to an analogous result in [5, 2]. For the sake of completeness we
sketch here a simplified argument, adapted to our terminology and notation. The idea is to
consider a different random process that is easier to analyze and whose expected connection
cost is not better than that in the algorithm.

We partition $\overline{N}(\nu)$ into groups $G_1, \ldots, G_k$, where two different facilities
$\mu$ and $\mu'$ are put in the same $G_s$, where $s \in \{1, \ldots, k\}$, if they both belong to
the same set $\overline{N}(\kappa)$ for some primary demand $\kappa$. If some $\mu$ is not a
neighbor of any primary demand, then it constitutes a singleton group. For each s, let
$d_s = D(G_s, \nu)$ be the average distance from $\nu$ to $G_s$. Assume
that $G_1, \ldots, G_k$ are ordered by nondecreasing average distance to $\nu$, that is
$d_1 \le d_2 \le \cdots \le d_k$. For each group $G_s$, we select it, independently, with
probability $g_s = \sum_{\mu\in G_s} \bar{y}_{\mu}$. For each selected group $G_s$, we open exactly
one facility in $G_s$, where each $\mu \in G_s$ is opened with
probability $\bar{y}_{\mu} / \sum_{\eta\in G_s} \bar{y}_{\eta}$.

So far, this process is the same as that in the algorithm (if restricted to $\overline{N}(\nu)$).
However, we connect $\nu$ in a slightly different way, by choosing the smallest s for which $G_s$
was selected and connecting $\nu$ to the open facility in $G_s$. This can only increase our
expected connection cost, assuming that at least one facility in $\overline{N}(\nu)$ opens, so

$$E[C_{\nu} \mid \Lambda^{\nu}] \le \frac{1}{P[\Lambda^{\nu}]} \left( d_1 g_1 + d_2 g_2 (1 - g_1) + \cdots + d_k g_k (1 - g_1)(1 - g_2)\cdots(1 - g_{k-1}) \right)$$

$$\le \frac{1}{P[\Lambda^{\nu}]} \left( \sum_{s=1}^{k} d_s g_s \right) \left( \sum_{t=1}^{k} g_t \prod_{z=1}^{t-1} (1 - g_z) \right) \qquad (5)$$

$$= \sum_{s=1}^{k} d_s g_s \qquad (6)$$

$$= C^{\mathrm{avg}}_{\nu}. \qquad (7)$$

The proof for inequality (5) is given in Appendix B (note that $\sum_{s=1}^{k} g_s = 1$),
equality (6) follows from
$P[\Lambda^{\nu}] = 1 - \prod_{t=1}^{k} (1 - g_t) = \sum_{t=1}^{k} g_t \prod_{z=1}^{t-1} (1 - g_z)$,
and (7) follows from the definition of the distances $d_s$, probabilities $g_s$, and simple
algebra. □

Next, we show an estimate on the probability that none of $\nu$'s neighbors is opened by
the algorithm.

Lemma 11. The probability that none of $\nu$'s neighbors is opened satisfies
$P[\neg\Lambda^{\nu}] \le 1/e$.

Proof. We use the same partition of $\overline{N}(\nu)$ into groups $G_1, \ldots, G_k$ as in the
proof of Lemma 10. Denoting by $g_s$ the probability that a group $G_s$ is selected (and thus that
it has an open facility), we have

$$P[\neg\Lambda^{\nu}] = \prod_{s=1}^{k} (1 - g_s) \le e^{-\sum_{s=1}^{k} g_s} = e^{-\sum_{\mu\in\overline{N}(\nu)} \bar{y}_{\mu}} = \frac{1}{e}.$$

In this derivation, we first use that $1 - x \le e^{-x}$ holds for all x, the second equality
follows from $\sum_{s=1}^{k} g_s = \sum_{\mu\in\overline{N}(\nu)} \bar{y}_{\mu}$, and the last
equality follows from $\sum_{\mu\in\overline{N}(\nu)} \bar{y}_{\mu} = 1$. □
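The two probability facts used in the last two proofs are easy to check numerically. The sketch below is only an illustration: the function names are hypothetical, and the group probabilities are passed as a plain list that is assumed to sum to 1.

```python
import math

def prob_none_selected(g):
    """Probability that no group is selected, for independent selections."""
    return math.prod(1.0 - gs for gs in g)

def prob_some_selected_telescoped(g):
    """Sum over t of g_t * prod_{z<t} (1 - g_z): the chance that group t
    is the first selected group, added up over all t."""
    total, none_so_far = 0.0, 1.0
    for gs in g:
        total += gs * none_so_far
        none_so_far *= 1.0 - gs
    return total
```

For any probabilities summing to 1, `prob_none_selected` stays at or below 1/e, and the two functions add up to 1, which is the identity behind equality (6).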

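We are now ready to estimate the unconditional expected connection cost of $\nu$ (in the
case when $\overline{N}(\kappa) \setminus \overline{N}(\nu) \ne \emptyset$) as follows:

$$E[C_{\nu}] = E[C_{\nu} \mid \Lambda^{\nu}] \cdot P[\Lambda^{\nu}] + E[C_{\nu} \mid \neg\Lambda^{\nu}] \cdot P[\neg\Lambda^{\nu}]$$

$$\le C^{\mathrm{avg}}_{\nu} \cdot P[\Lambda^{\nu}] + (C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu}) \cdot P[\neg\Lambda^{\nu}] \qquad (8)$$

$$= C^{\mathrm{avg}}_{\nu} + 2\alpha^*_{\nu} \cdot P[\neg\Lambda^{\nu}]$$

$$\le C^{\mathrm{avg}}_{\nu} + \frac{2}{e}\,\alpha^*_{\nu}. \qquad (9)$$

In the above derivation, inequality (8) follows from Lemmas 9 and 10, and inequality (9)
follows from Lemma 11.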

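We have thus shown that the bound (4) holds in all three cases. Summing over all
demands $\nu$ of a client j, we can now bound the expected connection cost of client j:

$$\sum_{\nu\in j} E[C_{\nu}] \le \sum_{\nu\in j} \left( C^{\mathrm{avg}}_{\nu} + \frac{2}{e}\,\alpha^*_{\nu} \right) = C^*_j + \frac{2}{e}\, r_j \alpha^*_j.$$

Finally, summing over all clients j, we obtain our bound on the expected connection cost,

$$E[C_{\mathrm{ECHS}}] \le C^* + \frac{2}{e}\cdot LP^*.$$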

Therefore we have established that our algorithm constructs a feasible integral solution with
an overall expected cost

$$E[F_{\mathrm{ECHS}} + C_{\mathrm{ECHS}}] \le F^* + C^* + \frac{2}{e}\cdot LP^* = (1 + 2/e)\cdot LP^* \le (1 + 2/e)\cdot OPT.$$

Summarizing, we obtain the main result of this section.

Theorem 12. Algorithm ECHS is a $(1 + 2/e)$-approximation algorithm for FTFP.

7. Algorithm EBGS with Ratio 1.575



In this section we give our main result, a 1.575-approximation algorithm for FTFP, where
1.575 is the value of
$\min_{\gamma\ge 1} \max\{\gamma,\ 1 + 2/e^{\gamma},\ \frac{1/e + 1/e^{\gamma}}{1 - 1/\gamma}\}$,
rounded to three decimal digits. This
matches the ratio of the best known LP-rounding algorithm for UFL by Byrka et al. [3].
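The constant can be reproduced numerically. The sketch below assumes that the three terms inside the maximum are $\gamma$, $1 + 2/e^{\gamma}$ and $(1/e + 1/e^{\gamma})/(1 - 1/\gamma)$; the function names and the grid search over $\gamma \in (1, 2]$ are illustrative choices, not taken from the paper.

```python
import math

def ratio(gamma):
    """Largest of the three terms for a given gamma > 1."""
    term1 = gamma
    term2 = 1 + 2 / math.exp(gamma)
    term3 = (1 / math.e + 1 / math.exp(gamma)) / (1 - 1 / gamma)
    return max(term1, term2, term3)

def best_gamma(lo=1.0001, hi=2.0, steps=200_000):
    """Grid search for the gamma that minimises ratio(gamma)."""
    grid = (lo + (hi - lo) * k / steps for k in range(steps + 1))
    gamma = min(grid, key=ratio)
    return gamma, ratio(gamma)
```

Under these assumptions the minimum is attained where the first and third terms meet, that is where $\gamma = 1 + 1/e + e^{-\gamma}$, so `best_gamma()` returns a $\gamma$ and a ratio that agree with each other up to the grid resolution.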

Recall that in Section 6 we showed how to compute an integral solution with facility cost
bounded by $F^*$ and connection cost bounded by $C^* + (2/e)\cdot LP^*$. Thus, while our facility
cost does not exceed the optimal fractional facility cost, our connection cost is significantly
larger than the connection cost in the optimal fractional solution. A natural idea is to balance
these two ratios by reducing the connection cost at the expense of the facility cost. One
way to do this would be to increase the probability of opening facilities, from $\bar{y}_{\mu}$
(used in Algorithm ECHS) to, say, $\gamma\bar{y}_{\mu}$, for some $\gamma > 1$. This increases the
expected facility cost by a factor of $\gamma$ but, as it turns out, it also reduces the
probability that an indirect connection occurs for a non-primary demand to $1/e^{\gamma}$ (from
the previous value 1/e in ECHS). As a consequence, for each primary demand $\kappa$, the new
algorithm will select a facility to open from the nearest facilities $\mu$ in
$\overline{N}(\kappa)$ such that the connection values $\bar{x}_{\mu\kappa}$ sum up to $1/\gamma$,
instead of 1 as in Algorithm ECHS. It is easily seen that this will improve the estimate
on connection cost for primary demands. These two changes, along with a more refined
analysis, are the essence of the approach in [3], expressed in our terminology.

Our approach can be thought of as a combination of the above ideas with the techniques 
of demand reduction and adaptive partitioning that we introduced earlier. However, our 
adaptive partitioning technique needs to be carefully modified, because now we will be using 
a more intricate neighborhood structure, with the neighborhood of each demand divided into 
two disjoint parts, and with restrictions on how parts from different demands can overlap. 

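We begin by describing properties that our partitioned fractional solution $(\bar{x}, \bar{y})$
needs to satisfy. Assume that $\gamma$ is some constant such that $1 < \gamma < 2$. As mentioned
earlier, the neighborhood $\overline{N}(\nu)$ of each demand $\nu$ will be divided into two
disjoint parts. The first part, called the close neighborhood and denoted
$\overline{N}_{\mathrm{cls}}(\nu)$, contains the facilities in $\overline{N}(\nu)$ nearest to
$\nu$ with the total connection value equal to $1/\gamma$, that is
$\sum_{\mu\in\overline{N}_{\mathrm{cls}}(\nu)} \bar{x}_{\mu\nu} = 1/\gamma$. The second part,
called the far neighborhood and denoted $\overline{N}_{\mathrm{far}}(\nu)$, contains the remaining
facilities in $\overline{N}(\nu)$
(so $\sum_{\mu\in\overline{N}_{\mathrm{far}}(\nu)} \bar{x}_{\mu\nu} = 1 - 1/\gamma$). We restate
these definitions formally below in Property (NB). Recall that for any set A of facilities and a
demand $\nu$, by $D(A, \nu)$ we denote the average distance between $\nu$ and the facilities in A,
that is $D(A, \nu) = \sum_{\mu\in A} d_{\mu\nu}\bar{y}_{\mu} / \sum_{\mu\in A} \bar{y}_{\mu}$. We
will use notations
$C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) = D(\overline{N}_{\mathrm{cls}}(\nu), \nu)$ and
$C^{\mathrm{avg}}_{\mathrm{far}}(\nu) = D(\overline{N}_{\mathrm{far}}(\nu), \nu)$ for the average
distances from $\nu$ to its close and far neighborhoods, respectively. By the definition of these
sets and the completeness property (CO), these distances can be expressed as

$$C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) = \gamma \sum_{\mu\in\overline{N}_{\mathrm{cls}}(\nu)} d_{\mu\nu}\,\bar{x}_{\mu\nu} \quad\text{and}\quad C^{\mathrm{avg}}_{\mathrm{far}}(\nu) = \frac{\gamma}{\gamma - 1} \sum_{\mu\in\overline{N}_{\mathrm{far}}(\nu)} d_{\mu\nu}\,\bar{x}_{\mu\nu}.$$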

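We will also use notation
$C^{\mathrm{max}}_{\mathrm{cls}}(\nu) = \max_{\mu\in\overline{N}_{\mathrm{cls}}(\nu)} d_{\mu\nu}$
for the maximum distance from $\nu$ to its close neighborhood. The average distance from a demand
$\nu$ to its overall neighborhood $\overline{N}(\nu)$ is denoted as
$C^{\mathrm{avg}}(\nu) = D(\overline{N}(\nu), \nu) = \sum_{\mu\in\overline{N}(\nu)} d_{\mu\nu}\,\bar{x}_{\mu\nu}$.
It is easy to see that

$$C^{\mathrm{avg}}(\nu) = \frac{1}{\gamma}\, C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + \frac{\gamma - 1}{\gamma}\, C^{\mathrm{avg}}_{\mathrm{far}}(\nu). \qquad (10)$$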
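For completeness, identity (10) can be spelled out in one short derivation. The notation below ($\bar{x}_{\mu\nu}$ for connection values, $\overline{N}_{\mathrm{cls}}$ and $\overline{N}_{\mathrm{far}}$ for the close and far neighborhoods) is our reading of the surrounding definitions; the only facts used are that the close neighborhood carries total connection value $1/\gamma$, the far neighborhood carries $1 - 1/\gamma$, and each average is a weighted sum divided by its total weight.

```latex
\begin{align*}
C^{\mathrm{avg}}(\nu)
  &= \sum_{\mu\in\overline{N}(\nu)} d_{\mu\nu}\,\bar{x}_{\mu\nu}
   = \sum_{\mu\in\overline{N}_{\mathrm{cls}}(\nu)} d_{\mu\nu}\,\bar{x}_{\mu\nu}
   + \sum_{\mu\in\overline{N}_{\mathrm{far}}(\nu)} d_{\mu\nu}\,\bar{x}_{\mu\nu} \\
  &= \frac{1}{\gamma}\, C^{\mathrm{avg}}_{\mathrm{cls}}(\nu)
   + \Bigl(1-\frac{1}{\gamma}\Bigr)\, C^{\mathrm{avg}}_{\mathrm{far}}(\nu).
\end{align*}
```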

Our partitioned solution $(\bar{x}, \bar{y})$ must satisfy the same partitioning and completeness
properties as before, namely properties (PS) and (CO) in Section 4. In addition, it must
satisfy a new neighborhood property (NB) and modified properties (PD') and (SI'), listed
below.

(NB) Neighborhoods. For each demand $\nu \in \overline{\mathbb{C}}$, its neighborhood is divided
into close and far neighborhood, that is
$\overline{N}(\nu) = \overline{N}_{\mathrm{cls}}(\nu) \cup \overline{N}_{\mathrm{far}}(\nu)$, where

• $\overline{N}_{\mathrm{cls}}(\nu) \cap \overline{N}_{\mathrm{far}}(\nu) = \emptyset$,

• $\sum_{\mu\in\overline{N}_{\mathrm{cls}}(\nu)} \bar{x}_{\mu\nu} = 1/\gamma$, and

• if $\mu \in \overline{N}_{\mathrm{cls}}(\nu)$ and $\mu' \in \overline{N}_{\mathrm{far}}(\nu)$ then $d_{\mu\nu} \le d_{\mu'\nu}$.

Note that the first two conditions, together with (PS.1), imply that
$\sum_{\mu\in\overline{N}_{\mathrm{far}}(\nu)} \bar{x}_{\mu\nu} = 1 - 1/\gamma$. When defining
$\overline{N}_{\mathrm{cls}}(\nu)$, in case of ties, which can occur when some facilities
in $\overline{N}(\nu)$ are at the same distance from $\nu$, we use a tie-breaking rule that is
explained in the proof of Lemma 13 (the only place where the rule is needed).



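(PD') Primary demands. Primary demands satisfy the following conditions:

1. For any two different primary demands $\kappa, \kappa' \in P$ we have
$\overline{N}_{\mathrm{cls}}(\kappa) \cap \overline{N}_{\mathrm{cls}}(\kappa') = \emptyset$.

2. For each site $i \in \mathbb{F}$,
$\sum_{\kappa\in P} \sum_{\mu\in i\cap\overline{N}_{\mathrm{cls}}(\kappa)} \bar{x}_{\mu\kappa} \le y^*_i$.
In the summation, as before, we overload notation i to stand for the set of facilities created
on site i.

3. Each demand $\nu \in \overline{\mathbb{C}}$ is assigned to one primary demand $\kappa \in P$
such that

(a) $\overline{N}_{\mathrm{cls}}(\nu) \cap \overline{N}_{\mathrm{cls}}(\kappa) \ne \emptyset$, and

(b) $C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + C^{\mathrm{max}}_{\mathrm{cls}}(\nu) \ge C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\mathrm{max}}_{\mathrm{cls}}(\kappa)$.

(SI') Siblings. For any pair $\nu, \nu' \in \overline{\mathbb{C}}$ of different siblings we have

1. $\overline{N}(\nu) \cap \overline{N}(\nu') = \emptyset$.

2. If $\nu$ is assigned to a primary demand $\kappa$ then
$\overline{N}(\nu') \cap \overline{N}_{\mathrm{cls}}(\kappa) = \emptyset$. In particular,
by Property (PD'.3(a)), this implies that different sibling demands are assigned
to different primary demands, since $\overline{N}_{\mathrm{cls}}(\nu')$ is a subset of
$\overline{N}(\nu')$.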



Modified adaptive partitioning. To obtain a fractional solution with the above properties,
we employ a modified adaptive partitioning algorithm. As in Section 4, we have two
phases. In Phase 1 we split clients into demands and create facilities on sites, while in Phase 2
we augment each demand's connection values $\bar{x}_{\mu\nu}$ so that the total connection value
of each demand $\nu$ is 1. As the partitioning algorithm proceeds, for any demand $\nu$,
$\overline{N}(\nu)$ denotes the set of facilities with $\bar{x}_{\mu\nu} > 0$; hence the notation
$\overline{N}(\nu)$ actually represents a dynamic set which gets fixed once the partitioning
algorithm concludes both Phase 1 and Phase 2. On the other hand,
$\overline{N}_{\mathrm{cls}}(\nu)$ and $\overline{N}_{\mathrm{far}}(\nu)$ refer to the close and
far neighborhoods at the time when $\overline{N}(\nu)$ is fixed.

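Similar to the algorithm in Section 4, Phase 1 runs in iterations. Fix some iteration
and consider any client j. As before, $\widetilde{N}(j)$ is the neighborhood of j with respect to
the yet unpartitioned solution, namely the set of facilities $\mu$ such that
$\widetilde{x}_{\mu j} > 0$. Order the facilities in this set as
$\widetilde{N}(j) = \{\mu_1, \ldots, \mu_q\}$ with non-decreasing distance from j, that is
$d_{\mu_1 j} \le d_{\mu_2 j} \le \cdots \le d_{\mu_q j}$. Without loss of generality, there is an
index $\ell$ for which $\sum_{s=1}^{\ell} \widetilde{x}_{\mu_s j} = 1/\gamma$, since
we can always split one facility to achieve this. Then we define
$\widetilde{N}_{\mathrm{cls}}(j) = \{\mu_1, \ldots, \mu_\ell\}$. (Unlike
close neighborhoods of demands, $\widetilde{N}_{\mathrm{cls}}(j)$ can vary over time.) We also use
notation

$$\mathrm{tcc}_{\mathrm{cls}}(j) = D(\widetilde{N}_{\mathrm{cls}}(j), j) = \gamma \sum_{\mu\in\widetilde{N}_{\mathrm{cls}}(j)} d_{\mu j}\,\widetilde{x}_{\mu j} \quad\text{and}\quad \mathrm{dmax}_{\mathrm{cls}}(j) = \max_{\mu\in\widetilde{N}_{\mathrm{cls}}(j)} d_{\mu j}.$$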
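The two quantities just defined can be computed for a single client along the following lines. This is a sketch only: the function name `close_neighborhood` and the list-of-pairs input are assumptions made for illustration, and splitting a facility is modelled simply by taking part of its connection value.

```python
def close_neighborhood(facilities, gamma):
    """Hypothetical helper for one client.

    facilities: list of (distance, connection value) pairs whose values
                sum to at least 1/gamma.
    Returns (chunk, tcc_cls, dmax_cls), where chunk lists the
    (distance, value) pieces that make up the close neighborhood.
    """
    need = 1.0 / gamma
    chunk = []
    for d, x in sorted(facilities):
        if need <= 1e-12:
            break
        take = min(x, need)          # a partial take models a facility split
        chunk.append((d, take))
        need -= take
    if need > 1e-12:
        raise ValueError("total connection value is below 1/gamma")
    # Average distance: weighted sum divided by the total weight 1/gamma.
    tcc_cls = gamma * sum(d * x for d, x in chunk)
    dmax_cls = max(d for d, _ in chunk)
    return chunk, tcc_cls, dmax_cls
```

A Phase 1 iteration would then pick the not-yet-exhausted client with the smallest value of `tcc_cls + dmax_cls`.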

When the iteration starts, we first find a not-yet-exhausted client p that minimizes the
value of $\mathrm{tcc}_{\mathrm{cls}}(p) + \mathrm{dmax}_{\mathrm{cls}}(p)$ and create a new demand
$\nu$ for p. Now we have two cases:

Case 1 : $\widetilde{N}_{\mathrm{cls}}(p) \cap \overline{N}(\kappa) \ne \emptyset$ for some
existing primary demand $\kappa \in P$. In this case we assign $\nu$ to $\kappa$. As before, if
there are multiple such $\kappa$, we pick any of them. We also fix
$\bar{x}_{\mu\nu} \leftarrow \widetilde{x}_{\mu p}$ and $\widetilde{x}_{\mu p} \leftarrow 0$ for
each $\mu \in \widetilde{N}(p) \cap \overline{N}(\kappa)$. Note that although we check for
overlap between $\widetilde{N}_{\mathrm{cls}}(p)$ and $\overline{N}(\kappa)$, the facilities we
actually move into $\overline{N}(\nu)$ include all facilities in the intersection of
$\widetilde{N}(p)$, a bigger set, with $\overline{N}(\kappa)$.

At this time, the total connection value between $\nu$ and $\mu \in \overline{N}(\nu)$ is at most
$1/\gamma$, since $\sum_{\mu\in\overline{N}(\kappa)} \bar{y}_{\mu} = 1/\gamma$ (this follows from
the definition of neighborhoods for new primary demands in Case 2 below) and we have
$\overline{N}(\nu) \subseteq \overline{N}(\kappa)$ at this point. Later in Phase 2
we will add additional facilities from $\widetilde{N}(p)$ to $\overline{N}(\nu)$ to make $\nu$'s
total connection value equal to 1.

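Case 2 : $\widetilde{N}_{\mathrm{cls}}(p) \cap \overline{N}(\kappa) = \emptyset$ for all existing
primary demands $\kappa \in P$. In this case we make $\nu$ a primary demand (that is, add it to P)
and assign it to itself. We then move the facilities from $\widetilde{N}_{\mathrm{cls}}(p)$ to
$\overline{N}(\nu)$, that is, for $\mu \in \widetilde{N}_{\mathrm{cls}}(p)$ we set
$\bar{x}_{\mu\nu} \leftarrow \widetilde{x}_{\mu p}$ and $\widetilde{x}_{\mu p} \leftarrow 0$.

It is easy to see that the total connection value of $\nu$ to $\overline{N}(\nu)$ is now exactly
$1/\gamma$, that is $\sum_{\mu\in\overline{N}(\nu)} \bar{y}_{\mu} = 1/\gamma$. Moreover,
facilities remaining in $\widetilde{N}(p)$ are all farther away from $\nu$ than those in
$\overline{N}(\nu)$. As we add only facilities from $\widetilde{N}(p)$ to $\overline{N}(\nu)$ in
Phase 2, the final $\overline{N}_{\mathrm{cls}}(\nu)$ contains the same set of facilities as the
current set $\overline{N}(\nu)$. (More precisely, $\overline{N}_{\mathrm{cls}}(\nu)$ consists of
the facilities that either are currently in $\overline{N}(\nu)$ or were obtained from
splitting the facilities currently in $\overline{N}(\nu)$.)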

Once all clients are exhausted, that is, each client j has $r_j$ demands created, Phase 1
concludes. We then run Phase 2, the augmenting phase, following the same steps as in Section 4.
For each client j and each demand $\nu \in j$ with total connection value to $\overline{N}(\nu)$
less than 1 (that is, $\sum_{\mu\in\overline{N}(\nu)} \bar{x}_{\mu\nu} < 1$), we use our
AugmentToUnit() procedure to add additional facilities (possibly split, if necessary) from
$\widetilde{N}(j)$ to $\overline{N}(\nu)$ to make the total connection value
between $\nu$ and $\overline{N}(\nu)$ equal to 1.
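The augmenting step for one demand can be pictured with the following sketch. It is not the paper's AugmentToUnit() procedure: the name `augment_to_unit`, the dictionary representation, the order in which facilities are taken, and the naming of split facilities are all assumptions, and the sketch tracks only one client, whereas a real split also affects the connection values of other clients.

```python
def augment_to_unit(demand_nbhd, client_nbhd):
    """Hypothetical sketch; both arguments are {facility name: connection
    value} dictionaries with string keys and are modified in place.

    Moves connection value from the client's remaining neighborhood to the
    demand until the demand's total connection value reaches 1.
    """
    need = 1.0 - sum(demand_nbhd.values())
    for mu in list(client_nbhd):
        if need <= 1e-12:
            break
        value = client_nbhd.pop(mu)
        if value <= need:
            demand_nbhd[mu] = value            # the whole facility moves
            need -= value
        else:
            # Split: one part goes to the demand, the rest stays with the client.
            demand_nbhd[mu + "'"] = need
            client_nbhd[mu + "''"] = value - need
            need = 0.0
```

Because every moved facility is removed from the client's dictionary, sibling demands augmented later cannot receive it, which mirrors the disjointness argument used for the sibling properties.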

This completes the description of the partitioning algorithm. Summarizing, for each
client $j \in \mathbb{C}$ we created $r_j$ demands on the same point as j, and we created a number
of facilities at each site $i \in \mathbb{F}$. Thus computed sets of demands and facilities are
denoted $\overline{\mathbb{C}}$ and $\overline{\mathbb{F}}$, respectively. For each facility
$\mu \in i$ we defined its fractional opening value $\bar{y}_{\mu}$, $0 \le \bar{y}_{\mu} \le 1$,
and for each demand $\nu \in j$ we defined its fractional connection value
$\bar{x}_{\mu\nu} \in \{0, \bar{y}_{\mu}\}$. The connections with $\bar{x}_{\mu\nu} > 0$ define
the neighborhood $\overline{N}(\nu)$. The facilities in $\overline{N}(\nu)$ that are
closest to $\nu$ and have total connection value from $\nu$ equal to $1/\gamma$ form the close
neighborhood $\overline{N}_{\mathrm{cls}}(\nu)$, while the remaining facilities in
$\overline{N}(\nu)$ form the far neighborhood $\overline{N}_{\mathrm{far}}(\nu)$. It remains
to show that this partitioning satisfies all the desired properties.

Correctness of partitioning. We now argue that our partitioned fractional solution [x, y) 
satisfies all the stated properties. Properties (PS), (CO) and (NB) are directly enforced by 
the algorithm. 

(PD'.1) holds because for each primary demand $\kappa \in p$, $\overline{N}_{\mathrm{cls}}(\kappa)$ is the same set as $\widetilde{N}_{\mathrm{cls}}(p)$ at
the time when $\kappa$ was created, and $\widetilde{N}_{\mathrm{cls}}(p)$ is removed from $\widetilde{N}(p)$ right after this step. Further,
the partitioning algorithm makes $\kappa$ a primary demand only if $\widetilde{N}_{\mathrm{cls}}(p)$ is disjoint from the set
$\overline{N}(\kappa')$ of all existing primary demands $\kappa'$ at that iteration, but these neighborhoods are the
same as the final close neighborhoods $\overline{N}_{\mathrm{cls}}(\kappa')$.

The justification of (PD'.2) is similar to that for (PD.2) from Section 4. All close
neighborhoods of primary demands are disjoint, due to (PD'.1), so each facility $\mu \in i$ can appear in
at most one $\overline{N}_{\mathrm{cls}}(\kappa)$, for some $\kappa \in P$. Condition (CO) implies that $\bar{y}_\mu = \bar{x}_{\mu\kappa}$ for $\mu \in \overline{N}_{\mathrm{cls}}(\kappa)$.
As a result, the summation on the left-hand side is not larger than $\sum_{\mu\in i} \bar{y}_\mu = y^*_i$.

Regarding (PD'.3(a)), at first glance this property seems to follow directly from the
algorithm, as we only assign a demand $\nu$ to a primary demand $\kappa$ when $\overline{N}(\nu)$ at that iteration
overlaps with $\overline{N}(\kappa)$ (which is equal to the final value of $\overline{N}_{\mathrm{cls}}(\kappa)$). However, it is a little more
subtle, as the final $\overline{N}_{\mathrm{cls}}(\nu)$ may contain facilities added to $\overline{N}(\nu)$ in Phase 2. Those facilities
may turn out to be closer to $\nu$ than some facilities in $\overline{N}(\kappa) \cap \widetilde{N}(j)$ (not $\widetilde{N}_{\mathrm{cls}}(j)$) that we
added to $\overline{N}(\nu)$ in Phase 1. If the final $\overline{N}_{\mathrm{cls}}(\nu)$ consists only of facilities added in Phase 2, we
no longer have the desired overlap of $\overline{N}_{\mathrm{cls}}(\kappa)$ and $\overline{N}_{\mathrm{cls}}(\nu)$. Luckily this bad scenario never
occurs. We postpone the proof of this property to Lemma 13. The proof of (PD'.3(b)) is
similar to that of Lemma 5, and we defer it to Lemma 14.



(SI'.1) follows directly from the algorithm because for each demand $\nu \in j$, all facilities
added to $\overline{N}(\nu)$ are immediately removed from $\widetilde{N}(j)$ and each facility is added to $\overline{N}(\nu)$ of
exactly one demand $\nu \in j$. Splitting facilities obviously preserves (SI'.1).

The proof of (SI'.2) is similar to that of Lemma 3. If $\kappa = \nu$ then (SI'.2) follows from
(SI'.1), so we can assume that $\kappa \ne \nu$. Suppose that $\nu' \in j$ is assigned to $\kappa' \in P$ and consider
the situation after Phase 1. By the way we reassign facilities in Case 1, at this time we have
$\overline{N}(\nu) \subseteq \overline{N}(\kappa) = \overline{N}_{\mathrm{cls}}(\kappa)$ and $\overline{N}(\nu') \subseteq \overline{N}(\kappa') = \overline{N}_{\mathrm{cls}}(\kappa')$, so $\overline{N}(\nu') \cap \overline{N}_{\mathrm{cls}}(\kappa) = \emptyset$, by (PD'.1).
Moreover, we have $\widetilde{N}(j) \cap \overline{N}_{\mathrm{cls}}(\kappa) = \emptyset$ after this iteration, because any facilities that were
also in $\overline{N}_{\mathrm{cls}}(\kappa)$ were removed from $\widetilde{N}(j)$ when $\nu$ was created. In Phase 2, augmentation does
not change $\overline{N}_{\mathrm{cls}}(\kappa)$ and all facilities added to $\overline{N}(\nu')$ are from the set $\widetilde{N}(j)$ at the end of Phase
1, which is a subset of the set $\widetilde{N}(j)$ after this iteration, since $\widetilde{N}(j)$ can only shrink. So the
condition (SI'.2) will remain true.



Lemma 13. Property (PD'.3(a)) holds. 



Proof. Let $j$ be the client for which $\nu \in j$. We consider an iteration when we create $\nu$ from
$j$ and assign it to $\kappa$, and within this proof, notation $\widetilde{N}_{\mathrm{cls}}(j)$ and $\widetilde{N}(j)$ will refer to the value
of the sets at this particular time. At this time, $\overline{N}(\nu)$ is initialized to $\widetilde{N}(j) \cap \overline{N}(\kappa)$. Recall
that $\overline{N}(\kappa)$ is now equal to the final $\overline{N}_{\mathrm{cls}}(\kappa)$ (taking into account facility splitting). We would
like to show that the set $\widetilde{N}_{\mathrm{cls}}(j) \cap \overline{N}_{\mathrm{cls}}(\kappa)$ (which is not empty) will be included in $\overline{N}_{\mathrm{cls}}(\nu)$ at
the end. Technically speaking, this will not be true due to facility splitting, so we need to
rephrase this claim and the proof in terms of the set of facilities obtained after the algorithm
completes.



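Figure 1: Illustration of the sets $\overline{N}(\nu)$, $A$, $B$, $E^-$ and $E^+$ in the proof of Lemma 13. Let
$X \Subset Y$ mean that the facility set $X$ is obtained from $Y$ by splitting facilities. We then have
$A \Subset \widetilde{N}(j)$, $B \Subset \widetilde{N}_{\mathrm{cls}}(j) \cap \overline{N}_{\mathrm{cls}}(\kappa)$, $E^- \Subset \widetilde{N}_{\mathrm{cls}}(j) - \overline{N}_{\mathrm{cls}}(\kappa)$, $E^+ \Subset \widetilde{N}(j) - \widetilde{N}_{\mathrm{cls}}(j)$.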



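We define the sets $A$, $B$, $E^-$ and $E^+$ as the subsets of $\overline{\mathbb{F}}$ (the final set of facilities) that
were obtained from splitting facilities in the sets $\widetilde{N}(j)$, $\widetilde{N}_{\mathrm{cls}}(j) \cap \overline{N}_{\mathrm{cls}}(\kappa)$, $\widetilde{N}_{\mathrm{cls}}(j) - \overline{N}_{\mathrm{cls}}(\kappa)$
and $\widetilde{N}(j) - \widetilde{N}_{\mathrm{cls}}(j)$, respectively. (See Figure 1.) We claim that at the end $B \subseteq \overline{N}_{\mathrm{cls}}(\nu)$, with
the caveat that the ties in the definition of $\overline{N}_{\mathrm{cls}}(\nu)$ are broken in favor of the facilities in $B$.
(This is the tie-breaking rule that we mentioned in the definition of $\overline{N}_{\mathrm{cls}}(\nu)$.) This will be
sufficient to prove the lemma because $B \ne \emptyset$, by the algorithm.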

We now prove this claim. In this paragraph $\overline{N}(\nu)$ denotes the final set $\overline{N}(\nu)$ after both
phases are completed. Thus the total connection value of $\overline{N}(\nu)$ to $\nu$ is 1. Note first that
$B \subseteq \overline{N}(\nu) \subseteq A$, because we never remove facilities from $\overline{N}(\nu)$ and we only add facilities from
$\widetilde{N}(j)$. Also, $B \cup E^-$ represents the facilities obtained from $\widetilde{N}_{\mathrm{cls}}(j)$, so $\sum_{\mu \in B \cup E^-} \bar{y}_\mu = 1/\gamma$.
This and $B \subseteq \overline{N}(\nu)$ imply that the total connection value of $B \cup (\overline{N}(\nu) \cap E^-)$ to $\nu$ is at
most $1/\gamma$. But all facilities in $B \cup (\overline{N}(\nu) \cap E^-)$ are closer to $\nu$ (taking into account our tie
breaking in property (NB)) than those in $E^+ \cap \overline{N}(\nu)$. It follows that $B \subseteq \overline{N}_{\mathrm{cls}}(\nu)$, completing
the proof. □



Lemma 14. Property (PD'.3(b)) holds.



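Proof. This proof is similar to that for Lemma 5. For a client $j$ and demand $\eta$, we will write
$\mathrm{tcc}^{\eta}_{\mathrm{cls}}(j)$ and $\mathrm{dmax}^{\eta}_{\mathrm{cls}}(j)$ to denote the values of $\mathrm{tcc}_{\mathrm{cls}}(j)$ and $\mathrm{dmax}_{\mathrm{cls}}(j)$ at the time when $\eta$
was created. (Here $\eta$ may or may not be a demand of client $j$.)

Suppose $\nu \in j$ is assigned to a primary demand $\kappa \in p$. By the way primary demands
are constructed in the partitioning algorithm, $\widetilde{N}_{\mathrm{cls}}(p)$ becomes $\overline{N}(\kappa)$, which is equal to the
final value of $\overline{N}_{\mathrm{cls}}(\kappa)$. So we have $C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) = \mathrm{tcc}^{\kappa}_{\mathrm{cls}}(p)$ and $C^{\max}_{\mathrm{cls}}(\kappa) = \mathrm{dmax}^{\kappa}_{\mathrm{cls}}(p)$. Further,
since we choose $p$ to minimize $\mathrm{tcc}_{\mathrm{cls}}(p) + \mathrm{dmax}_{\mathrm{cls}}(p)$, we have that $\mathrm{tcc}^{\kappa}_{\mathrm{cls}}(p) + \mathrm{dmax}^{\kappa}_{\mathrm{cls}}(p) \le
\mathrm{tcc}^{\kappa}_{\mathrm{cls}}(j) + \mathrm{dmax}^{\kappa}_{\mathrm{cls}}(j)$.

Using an argument analogous to that in the proof of Lemma 4, our modified partitioning
algorithm guarantees that $\mathrm{tcc}^{\kappa}_{\mathrm{cls}}(j) \le \mathrm{tcc}^{\nu}_{\mathrm{cls}}(j) \le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu)$ and $\mathrm{dmax}^{\kappa}_{\mathrm{cls}}(j) \le \mathrm{dmax}^{\nu}_{\mathrm{cls}}(j) \le
C^{\max}_{\mathrm{cls}}(\nu)$, since $\nu$ was created later. Therefore, we have

$$\begin{aligned}
C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\max}_{\mathrm{cls}}(\kappa) &= \mathrm{tcc}^{\kappa}_{\mathrm{cls}}(p) + \mathrm{dmax}^{\kappa}_{\mathrm{cls}}(p) \\
&\le \mathrm{tcc}^{\kappa}_{\mathrm{cls}}(j) + \mathrm{dmax}^{\kappa}_{\mathrm{cls}}(j) \le \mathrm{tcc}^{\nu}_{\mathrm{cls}}(j) + \mathrm{dmax}^{\nu}_{\mathrm{cls}}(j) \le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + C^{\max}_{\mathrm{cls}}(\nu),
\end{aligned}$$

completing the proof. □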

Now we have completed the proof that the computed partitioning satisfies all the required 
properties. 

Algorithm EBGS. The complete algorithm starts with solving the LP and computing
the partitioning described earlier in this section. Given the partitioned fractional solution
$(\bar{x}, \bar{y})$ with the desired properties, we start the process of opening facilities and making
connections to obtain an integral solution. To this end, for each primary demand $\kappa \in P$,
we open exactly one facility $\phi(\kappa)$ in $\overline{N}_{\mathrm{cls}}(\kappa)$, where each $\mu \in \overline{N}_{\mathrm{cls}}(\kappa)$ is chosen as $\phi(\kappa)$ with
probability $\gamma\bar{y}_\mu$. For all facilities $\mu \in \overline{\mathbb{F}} - \bigcup_{\kappa \in P} \overline{N}_{\mathrm{cls}}(\kappa)$, we open them independently, each
with probability $\gamma\bar{y}_\mu$.

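We claim that all probabilities are well-defined, that is $\gamma\bar{y}_\mu \le 1$ for all $\mu$. Indeed, if
$\bar{y}_\mu > 0$ then $\bar{y}_\mu = \bar{x}_{\mu\nu}$ for some $\nu$, by Property (CO). If $\mu \in \overline{N}_{\mathrm{cls}}(\nu)$ then the definition of
close neighborhoods implies that $\bar{x}_{\mu\nu} \le 1/\gamma$. If $\mu \in \overline{N}_{\mathrm{far}}(\nu)$ then $\bar{x}_{\mu\nu} \le 1 - 1/\gamma \le 1/\gamma$,
because $\gamma < 2$. Thus $\gamma\bar{y}_\mu \le 1$, as claimed.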

Next, we connect demands to facilities. Each primary demand $\kappa \in P$ will connect to the
only open facility $\phi(\kappa)$ in $\overline{N}_{\mathrm{cls}}(\kappa)$. For each non-primary demand $\nu \in \overline{\mathbb{C}} - P$, if there is an
open facility in $\overline{N}_{\mathrm{cls}}(\nu)$ then we connect $\nu$ to the nearest such facility. Otherwise, we connect
$\nu$ to the nearest far facility in $\overline{N}_{\mathrm{far}}(\nu)$ if one is open. Otherwise, we connect $\nu$ to its target
facility $\phi(\kappa)$, where $\kappa$ is the primary demand that $\nu$ is assigned to.
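The opening and connection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the partitioned solution is given as plain dictionaries and lists, the close neighborhoods of primary demands are assumed pairwise disjoint with total opening value $1/\gamma$ each, and the names `open_facilities` and `connect` are hypothetical rather than taken from the paper.

```python
import random


def open_facilities(y, close_of_primary, gamma, rng=random):
    """Randomly open facilities as in the rounding step.

    y: dict facility -> fractional opening value.
    close_of_primary: dict primary demand -> list of facilities in its
        close neighborhood (assumed disjoint, opening values sum to 1/gamma).
    Returns (set of open facilities, dict primary demand -> its open facility).
    """
    opened, target, covered = set(), {}, set()
    for kappa, nbhd in close_of_primary.items():
        covered.update(nbhd)
        # Exactly one facility per primary demand, chosen with probability gamma * y.
        weights = [gamma * y[mu] for mu in nbhd]
        chosen = rng.choices(nbhd, weights=weights, k=1)[0]
        opened.add(chosen)
        target[kappa] = chosen
    for mu, y_mu in y.items():
        # All remaining facilities open independently with probability gamma * y.
        if mu not in covered and rng.random() < gamma * y_mu:
            opened.add(mu)
    return opened, target


def connect(close, far, dist, opened, target_facility):
    """Pick the facility for a non-primary demand: nearest open close
    facility, else nearest open far facility, else the target facility."""
    for group in (close, far):
        candidates = [mu for mu in group if mu in opened]
        if candidates:
            return min(candidates, key=lambda mu: dist[mu])
    return target_facility
```

Here `dist` maps each facility to its distance from the demand being connected, and `target_facility` is the facility opened for the primary demand that the demand is assigned to.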

Analysis. By the algorithm, for each client $j$, all its demands are connected to open
facilities. If two different siblings $\nu, \nu' \in j$ are assigned, respectively, to primary demands $\kappa$,
$\kappa'$ then, by Properties (SI'.1), (SI'.2), and (PD'.1) we have

$$\left(\overline{N}(\nu) \cup \overline{N}_{\mathrm{cls}}(\kappa)\right) \cap \left(\overline{N}(\nu') \cup \overline{N}_{\mathrm{cls}}(\kappa')\right) = \emptyset.$$

This condition guarantees that $\nu$ and $\nu'$ are assigned to different facilities, regardless of whether
they are connected to a neighbor facility or to their target facility. Therefore the computed
solution is feasible.

We now estimate the cost of the solution computed by Algorithm EBGS. The lemma 
below bounds the expected facility cost. 

Lemma 15. The expectation of facility cost $F_{\mathrm{EBGS}}$ of Algorithm EBGS is at most $\gamma F^*$.

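Proof. By the algorithm, each facility $\mu \in \overline{\mathbb{F}}$ is opened with probability $\gamma\bar{y}_\mu$, regardless
of whether it belongs to the close neighborhood of a primary demand or not. Therefore, by
linearity of expectation, we have that the expected facility cost is

$$\mathbb{E}[F_{\mathrm{EBGS}}] = \sum_{\mu \in \overline{\mathbb{F}}} f_\mu \gamma \bar{y}_\mu = \gamma \sum_{i \in \mathbb{F}} f_i \sum_{\mu \in i} \bar{y}_\mu = \gamma \sum_{i \in \mathbb{F}} f_i y^*_i = \gamma F^*,$$

where the third equality follows from (PS.3). □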

In the remainder of this section we focus on the connection cost. Let $C_\nu$ be the random
variable representing the connection cost of a demand $\nu$. Our objective is to show that the
expectation of $C_\nu$ satisfies

$$\mathbb{E}[C_\nu] \le C^{\mathrm{avg}}(\nu) \cdot \max\left\{ \frac{1/e + 1/e^{\gamma}}{1 - 1/\gamma},\; 1 + \frac{2}{e^{\gamma}} \right\}. \qquad (11)$$

If $\nu$ is a primary demand then, due to the algorithm, we have $\mathbb{E}[C_\nu] = C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) \le C^{\mathrm{avg}}(\nu)$,
so (11) is easily satisfied.

Thus for the rest of the argument we will focus on the case when $\nu$ is a non-primary
demand. Recall that the algorithm connects $\nu$ to the nearest open facility in $\overline{N}_{\mathrm{cls}}(\nu)$ if at
least one facility in $\overline{N}_{\mathrm{cls}}(\nu)$ is open. Otherwise the algorithm connects $\nu$ to the nearest open
facility in $\overline{N}_{\mathrm{far}}(\nu)$, if any. In the event that no facility in $\overline{N}(\nu)$ opens, the algorithm will
connect $\nu$ to its target facility $\phi(\kappa)$, where $\kappa$ is the primary demand that $\nu$ was assigned
to, and $\phi(\kappa)$ is the only facility open in $\overline{N}_{\mathrm{cls}}(\kappa)$. Let $\Lambda^{\nu}$ denote the event that at least one
facility in $\overline{N}(\nu)$ is open and $\Lambda^{\nu}_{\mathrm{cls}}$ be the event that at least one facility in $\overline{N}_{\mathrm{cls}}(\nu)$ is open.
$\neg\Lambda^{\nu}$ denotes the complement event of $\Lambda^{\nu}$, that is, the event that none of $\nu$'s neighbors opens.
We want to estimate the following three conditional expectations:

$$\mathbb{E}[C_\nu \mid \Lambda^{\nu}_{\mathrm{cls}}], \quad \mathbb{E}[C_\nu \mid \Lambda^{\nu} \wedge \neg\Lambda^{\nu}_{\mathrm{cls}}], \quad \text{and} \quad \mathbb{E}[C_\nu \mid \neg\Lambda^{\nu}],$$

and their associated probabilities.

We start with a lemma dealing with the third expectation, $\mathbb{E}[C_\nu \mid \neg\Lambda^{\nu}] = \mathbb{E}[d_{\phi(\kappa)\nu} \mid \neg\Lambda^{\nu}]$.
The proof of this lemma relies on Properties (PD'.3(a)) and (PD'.3(b)) of modified
partitioning and follows the reasoning in the proof of a similar lemma in [5]. For the sake
of completeness, we include a proof in Appendix A.



Lemma 16. Assuming that no facility in $\overline{N}(\nu)$ opens, the expected connection cost of $\nu$ is

$$\mathbb{E}[C_\nu \mid \neg\Lambda^{\nu}] \le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + 2\,C^{\mathrm{avg}}_{\mathrm{far}}(\nu). \qquad (12)$$

Proof. See Appendix A. □



Next, we derive some estimates for the expected cost of direct connections. The next
technical lemma is a generalization of Lemma 10. In Lemma 10 we bound the expected
distance to the closest open facility in $\overline{N}(\nu)$, conditioned on at least one facility in $\overline{N}(\nu)$
being open. The lemma below provides a similar estimate for an arbitrary set $A$ of facilities
in $\overline{N}(\nu)$, conditioned on at least one facility in $A$ being open. Recall that
$D(A, \nu) = \sum_{\mu \in A} d_{\mu\nu}\bar{y}_\mu / \sum_{\mu \in A} \bar{y}_\mu$ is the average distance from $\nu$ to a facility in $A$.

Lemma 17. For any non-empty set $A \subseteq \overline{N}(\nu)$, let $\Lambda^{\nu}_{A}$ be the event that at least one facility
in $A$ is opened by Algorithm EBGS, and denote by $C_\nu(A)$ the random variable representing
the distance from $\nu$ to the closest open facility in $A$. Then the expected distance from $\nu$ to
the nearest open facility in $A$, conditioned on at least one facility in $A$ being opened, is

$$\mathbb{E}[C_\nu(A) \mid \Lambda^{\nu}_{A}] \le D(A, \nu).$$



Proof. The proof follows the same reasoning as the proof of Lemma 10, so we only sketch
it here. We start with a similar grouping of facilities in $A$: for each primary demand $\kappa$, if
$\overline{N}_{\mathrm{cls}}(\kappa) \cap A \ne \emptyset$ then $\overline{N}_{\mathrm{cls}}(\kappa) \cap A$ forms a group. Facilities in $A$ that are not in a neighborhood
of any primary demand form singleton groups. We denote these groups $G_1, \ldots, G_k$. It is clear
that the groups are disjoint because of (PD'.1). Denoting by $\bar{d}_s = D(G_s, \nu)$ the average
distance from $\nu$ to a group $G_s$, we can assume that these groups are ordered so that
$\bar{d}_1 \le \ldots \le \bar{d}_k$.

Each group can have at most one facility open and the events representing opening of
any two facilities that belong to different groups are independent. To estimate the distance
from $\nu$ to the nearest open facility in $A$, we use an alternative random process to make
connections, one that is easier to analyze. Instead of connecting $\nu$ to the nearest open facility
in $A$, we will choose the smallest $s$ for which $G_s$ has an open facility and connect $\nu$ to this
facility. (Thus we select an open facility with respect to the minimum $\bar{d}_s$, not the actual
distance from $\nu$ to this facility.) This can only increase the expected connection cost. Thus,
denoting $g_s = \sum_{\mu \in G_s} \gamma\bar{y}_\mu$ for all $s = 1, \ldots, k$, and letting $\mathbb{P}[\Lambda^{\nu}_{A}]$ be the probability that $A$
has at least one facility open, we have

$$\begin{aligned}
\mathbb{E}[C_\nu(A) \mid \Lambda^{\nu}_{A}] &\le \frac{1}{\mathbb{P}[\Lambda^{\nu}_{A}]} \left( \bar{d}_1 g_1 + \bar{d}_2 g_2 (1 - g_1) + \ldots + \bar{d}_k g_k (1 - g_1) \cdots (1 - g_{k-1}) \right) \qquad (13) \\
&\le \frac{1}{\mathbb{P}[\Lambda^{\nu}_{A}]} \cdot \frac{\sum_{s=1}^{k} \bar{d}_s g_s}{\sum_{s=1}^{k} g_s} \cdot \left( 1 - \prod_{s=1}^{k} (1 - g_s) \right) \qquad (14) \\
&= \frac{\sum_{s=1}^{k} \bar{d}_s g_s}{\sum_{s=1}^{k} g_s} = \frac{\sum_{\mu \in A} d_{\mu\nu} \gamma \bar{y}_\mu}{\sum_{\mu \in A} \gamma \bar{y}_\mu} \\
&= \frac{\sum_{\mu \in A} d_{\mu\nu} \bar{y}_\mu}{\sum_{\mu \in A} \bar{y}_\mu} = D(A, \nu).
\end{aligned}$$

Inequality (14) follows from inequality (B.1) in Appendix B. The rest of the derivation
follows from $\mathbb{P}[\Lambda^{\nu}_{A}] = 1 - \prod_{s=1}^{k} (1 - g_s)$, and the definition of $\bar{d}_s$, $g_s$ and $D(A, \nu)$. □
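For intuition, the bound of Lemma 17 can be checked exactly on small instances in the special case where every group is a singleton, so that facilities open independently. The sketch below enumerates all opening patterns; the function names are illustrative, and the opening probabilities play the role of the values $g_s$.

```python
from itertools import product


def conditional_expected_nearest(dists, probs):
    """Exact E[distance to nearest open facility | at least one is open]
    when facility s opens independently with probability probs[s]."""
    weighted_sum = 0.0
    prob_some_open = 0.0
    for pattern in product([0, 1], repeat=len(dists)):
        if not any(pattern):
            continue  # condition on at least one open facility
        p = 1.0
        for is_open, g in zip(pattern, probs):
            p *= g if is_open else 1.0 - g
        nearest = min(d for d, is_open in zip(dists, pattern) if is_open)
        weighted_sum += p * nearest
        prob_some_open += p
    return weighted_sum / prob_some_open


def weighted_average_distance(dists, probs):
    """The average distance D(A, nu), weighting each facility by its probability."""
    return sum(d * g for d, g in zip(dists, probs)) / sum(probs)
```

For distances `[1, 2, 4]` and probabilities `[0.3, 0.5, 0.2]`, the conditional expectation is $1.28/0.72 \approx 1.78$, below the weighted average $2.1$.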



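A consequence of Lemma 17 is the following corollary, which bounds the other two
expectations of $C_\nu$: when at least one facility is opened in $\overline{N}_{\mathrm{cls}}(\nu)$, and when no facility in $\overline{N}_{\mathrm{cls}}(\nu)$
opens but a facility in $\overline{N}_{\mathrm{far}}(\nu)$ is opened.

Corollary 18. (a) $\mathbb{E}[C_\nu \mid \Lambda^{\nu}_{\mathrm{cls}}] \le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu)$, and (b) $\mathbb{E}[C_\nu \mid \Lambda^{\nu} \wedge \neg\Lambda^{\nu}_{\mathrm{cls}}] \le C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$.

Proof. When there is an open facility in $\overline{N}_{\mathrm{cls}}(\nu)$, the algorithm connects $\nu$ to the nearest open
facility in $\overline{N}_{\mathrm{cls}}(\nu)$. When no facility in $\overline{N}_{\mathrm{cls}}(\nu)$ opens but some facility in $\overline{N}_{\mathrm{far}}(\nu)$ opens, the
algorithm connects $\nu$ to the nearest open facility in $\overline{N}_{\mathrm{far}}(\nu)$. The rest of the proof follows
from Lemma 17. By setting the set $A$ in Lemma 17 to $\overline{N}_{\mathrm{cls}}(\nu)$, we have

$$\mathbb{E}[C_\nu \mid \Lambda^{\nu}_{\mathrm{cls}}] \le D(\overline{N}_{\mathrm{cls}}(\nu), \nu) = C^{\mathrm{avg}}_{\mathrm{cls}}(\nu),$$

proving part (a), and by setting the set $A$ to $\overline{N}_{\mathrm{far}}(\nu)$, we have

$$\mathbb{E}[C_\nu \mid \Lambda^{\nu} \wedge \neg\Lambda^{\nu}_{\mathrm{cls}}] \le D(\overline{N}_{\mathrm{far}}(\nu), \nu) = C^{\mathrm{avg}}_{\mathrm{far}}(\nu),$$

which proves part (b). □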



Given the estimates on the three expected distances, when $\nu$ connects to its close facility
in $\overline{N}_{\mathrm{cls}}(\nu)$ (Corollary 18(a)), or its far facility in $\overline{N}_{\mathrm{far}}(\nu)$ (Corollary 18(b)), or its target
facility in (12), the only missing pieces are estimates on the corresponding probabilities of
each event, which we give in the next lemma. Once done, we shall put all pieces together and
prove the desired inequality on $\mathbb{E}[C_\nu]$, that is (11).

The next lemma bounds the probabilities of the events that no facilities in $\overline{N}_{\mathrm{cls}}(\nu)$ and
$\overline{N}(\nu)$ are opened by the algorithm.



Lemma 19. (a) $\mathbb{P}[\neg\Lambda^{\nu}_{\mathrm{cls}}] \le 1/e$, and (b) $\mathbb{P}[\neg\Lambda^{\nu}] \le 1/e^{\gamma}$.

Proof. (a) To estimate $\mathbb{P}[\neg\Lambda^{\nu}_{\mathrm{cls}}]$, we again consider a grouping of facilities in $\overline{N}_{\mathrm{cls}}(\nu)$, as in the
proof of Lemma 17, according to the primary demand's close neighborhood that they fall in,
with facilities not belonging to such neighborhoods forming their own singleton groups. As
before, the groups are denoted $G_1, \ldots, G_k$. It is easy to see that $\sum_{s=1}^{k} g_s = \sum_{\mu \in \overline{N}_{\mathrm{cls}}(\nu)} \gamma\bar{y}_\mu = 1$.
For any group $G_s$, the probability that a facility in this group opens is $\sum_{\mu \in G_s} \gamma\bar{y}_\mu = g_s$,
because in the algorithm at most one facility in a group can be chosen and each is chosen
with probability $\gamma\bar{y}_\mu$. Therefore the probability that no facility opens is $\prod_{s=1}^{k} (1 - g_s)$, which
is at most $e^{-\sum_{s=1}^{k} g_s} = 1/e$. Therefore we have $\mathbb{P}[\neg\Lambda^{\nu}_{\mathrm{cls}}] \le 1/e$.

(b) This proof is similar to the proof of (a). The probability $\mathbb{P}[\neg\Lambda^{\nu}]$ is at most
$e^{-\sum_{s=1}^{k} g_s} = 1/e^{\gamma}$, because we now have $\sum_{s=1}^{k} g_s = \gamma \sum_{\mu \in \overline{N}(\nu)} \bar{y}_\mu = \gamma \cdot 1 = \gamma$. □
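The only analytic fact used in this proof is $\prod_s (1 - g_s) \le e^{-\sum_s g_s}$, which follows from $1 - g \le e^{-g}$. A short numeric sketch, with illustrative function names and example group probabilities that are not taken from the paper, shows both parts of the lemma on concrete values.

```python
import math


def prob_no_group_opens(group_probs):
    """Probability that no group has an open facility, for independent groups."""
    return math.prod(1.0 - g for g in group_probs)


def exponential_bound(group_probs):
    """The upper bound exp(-sum of g_s) used in the proof."""
    return math.exp(-sum(group_probs))


# Part (a): group probabilities summing to 1 (close neighborhood).
close_groups = [0.2, 0.3, 0.5]
# Part (b): group probabilities summing to gamma = 1.575 (whole neighborhood).
all_groups = [0.5, 0.5, 0.575]
```

Evaluating `prob_no_group_opens` on the two example lists gives values that stay below $1/e$ and $1/e^{\gamma}$, respectively.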



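We are now ready to bound the overall connection cost of Algorithm EBGS, namely
inequality (11).

Lemma 20. The expected connection cost of $\nu$ is

$$\mathbb{E}[C_\nu] \le C^{\mathrm{avg}}(\nu) \cdot \max\left\{ \frac{1/e + 1/e^{\gamma}}{1 - 1/\gamma},\; 1 + \frac{2}{e^{\gamma}} \right\}.$$

Proof. Recall that, to connect $\nu$, the algorithm uses the closest facility in $\overline{N}_{\mathrm{cls}}(\nu)$ if one is
opened; otherwise it will try to connect $\nu$ to the closest facility in $\overline{N}_{\mathrm{far}}(\nu)$. Failing that, it
will connect $\nu$ to $\phi(\kappa)$, the sole facility open in the neighborhood of $\kappa$, the primary demand
$\nu$ was assigned to. Given that, we estimate $\mathbb{E}[C_\nu]$ as follows:

$$\begin{aligned}
\mathbb{E}[C_\nu] &= \mathbb{E}[C_\nu \mid \Lambda^{\nu}_{\mathrm{cls}}] \cdot \mathbb{P}[\Lambda^{\nu}_{\mathrm{cls}}] + \mathbb{E}[C_\nu \mid \Lambda^{\nu} \wedge \neg\Lambda^{\nu}_{\mathrm{cls}}] \cdot \mathbb{P}[\Lambda^{\nu} \wedge \neg\Lambda^{\nu}_{\mathrm{cls}}] + \mathbb{E}[C_\nu \mid \neg\Lambda^{\nu}] \cdot \mathbb{P}[\neg\Lambda^{\nu}] \\
&\le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) \cdot \mathbb{P}[\Lambda^{\nu}_{\mathrm{cls}}] + C^{\mathrm{avg}}_{\mathrm{far}}(\nu) \cdot \mathbb{P}[\Lambda^{\nu} \wedge \neg\Lambda^{\nu}_{\mathrm{cls}}] + \left[ C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + 2\,C^{\mathrm{avg}}_{\mathrm{far}}(\nu) \right] \cdot \mathbb{P}[\neg\Lambda^{\nu}] \qquad (15) \\
&= \left[ C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu) \right] \cdot \mathbb{P}[\neg\Lambda^{\nu}] + \left[ C^{\mathrm{avg}}_{\mathrm{far}}(\nu) - C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) \right] \cdot \mathbb{P}[\neg\Lambda^{\nu}_{\mathrm{cls}}] + C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) \\
&\le \left[ C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu) \right] \cdot \frac{1}{e^{\gamma}} + \left[ C^{\mathrm{avg}}_{\mathrm{far}}(\nu) - C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) \right] \cdot \frac{1}{e} + C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) \qquad (16) \\
&= \left( 1 - \frac{1}{e} + \frac{1}{e^{\gamma}} \right) \cdot C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + \left( \frac{1}{e} + \frac{1}{e^{\gamma}} \right) \cdot C^{\mathrm{avg}}_{\mathrm{far}}(\nu).
\end{aligned}$$

Inequality (15) follows from Corollary 18 and Lemma 16. Inequality (16) follows from
Lemma 19 and $C^{\mathrm{avg}}_{\mathrm{far}}(\nu) - C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) \ge 0$.

Now define $\rho = C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) / C^{\mathrm{avg}}(\nu)$. It is easy to see that $\rho$ is between 0 and 1. Continuing
the above derivation, applying (10), we get

$$\mathbb{E}[C_\nu] \le C^{\mathrm{avg}}(\nu) \cdot \left( (1 - \rho)\, \frac{1/e + 1/e^{\gamma}}{1 - 1/\gamma} + \rho \left( 1 + \frac{2}{e^{\gamma}} \right) \right) \le C^{\mathrm{avg}}(\nu) \cdot \max\left\{ \frac{1/e + 1/e^{\gamma}}{1 - 1/\gamma},\; 1 + \frac{2}{e^{\gamma}} \right\},$$

and the proof is now complete. □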
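The last step of this proof rewrites a bound expressed through the close and far average distances as a convex combination in $\rho$. The sketch below checks that rewriting numerically. It assumes that the average distance decomposes as $C^{\mathrm{avg}} = \frac{1}{\gamma} C^{\mathrm{avg}}_{\mathrm{cls}} + (1 - \frac{1}{\gamma}) C^{\mathrm{avg}}_{\mathrm{far}}$, which is what one expects equation (10) to express since the close neighborhood carries connection value $1/\gamma$; all quantities are normalized so that $C^{\mathrm{avg}} = 1$, and the function names are illustrative.

```python
import math


def bound_from_close_far(rho, gamma):
    """Right-hand side after (16), with C_avg = 1 and C_cls = rho.

    Assumption: C_avg = C_cls / gamma + (1 - 1/gamma) * C_far, so that
    C_far = (1 - rho / gamma) / (1 - 1 / gamma).
    """
    c_cls = rho
    c_far = (1.0 - rho / gamma) / (1.0 - 1.0 / gamma)
    inv_e, inv_e_gamma = 1.0 / math.e, math.exp(-gamma)
    return (1.0 - inv_e + inv_e_gamma) * c_cls + (inv_e + inv_e_gamma) * c_far


def bound_as_convex_combination(rho, gamma):
    """The same bound written as (1 - rho) * a + rho * b."""
    a = (1.0 / math.e + math.exp(-gamma)) / (1.0 - 1.0 / gamma)
    b = 1.0 + 2.0 * math.exp(-gamma)
    return (1.0 - rho) * a + rho * b
```

Since the second form is a convex combination of the two terms inside the maximum, it never exceeds the larger of them for any $\rho \in [0, 1]$.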

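With Lemma 20 proven, we are now ready to bound our total connection cost. For any
client $j$ we have

$$\sum_{\nu \in j} C^{\mathrm{avg}}(\nu) = \sum_{i \in \mathbb{F}} d_{ij} \sum_{\nu \in j} \sum_{\mu \in i} \bar{x}_{\mu\nu} = \sum_{i \in \mathbb{F}} d_{ij} x^*_{ij} = C^*_j.$$

Summing over all clients $j$ we obtain that the total expected connection cost is

$$\mathbb{E}[C_{\mathrm{EBGS}}] \le C^* \max\left\{ \frac{1/e + 1/e^{\gamma}}{1 - 1/\gamma},\; 1 + \frac{2}{e^{\gamma}} \right\}.$$

Recall that the expected facility cost is bounded by $\gamma F^*$, as argued earlier. Hence the total
expected cost is bounded by $\max\left\{ \gamma,\; \frac{1/e + 1/e^{\gamma}}{1 - 1/\gamma},\; 1 + \frac{2}{e^{\gamma}} \right\} \cdot \mathrm{LP}^*$. Picking $\gamma = 1.575$ we obtain the
desired ratio.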
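The choice of $\gamma$ can be examined numerically. The following sketch evaluates the three terms of the maximum and scans a grid of values of $\gamma$ in $(1, 2)$; the function names and the grid are illustrative choices, not part of the paper.

```python
import math


def connection_terms(gamma):
    """The two connection-cost terms inside the maximum."""
    first = (1.0 / math.e + math.exp(-gamma)) / (1.0 - 1.0 / gamma)
    second = 1.0 + 2.0 * math.exp(-gamma)
    return first, second


def ebgs_ratio(gamma):
    """max{gamma, (1/e + 1/e^gamma) / (1 - 1/gamma), 1 + 2/e^gamma}."""
    first, second = connection_terms(gamma)
    return max(gamma, first, second)


def best_gamma_on_grid(lo=1.5, hi=1.7, step=1e-4):
    """Grid search for the gamma minimizing the ratio (illustrative only)."""
    count = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(count + 1)]
    return min(candidates, key=ebgs_ratio)
```

At $\gamma = 1.575$ the first connection term is slightly below $\gamma$, so the facility term $\gamma$ determines the ratio, and the grid minimizer lies very close to $1.575$.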

Theorem 21. Algorithm EBGS is a 1.575-approximation algorithm for FTFP.

8. Final Comments

In this paper we show a sequence of LP-rounding approximation algorithms for FTFP, 
with the best algorithm achieving ratio 1.575. As we mentioned earlier, we believe that 
our techniques of demand reduction and adaptive partitioning are very flexible and should 
be useful in extending other LP-rounding methods for UFL to obtain matching bounds for 
FTFP. 

One of the main open problems in this area is whether FTFL can be approximated with 
the same ratio as UFL, and our work was partly motivated by this question. The techniques 
we introduced are not directly applicable to FTFL, mainly because our partitioning approach 
involves facility splitting that could result in several sibling demands being served by facilities 
on the same site. Nonetheless, we hope that further refinements of our construction might 
get around this issue and lead to new algorithms for FTFL with improved ratios. 
Appendix A. Proof of Lemma 16



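Lemma 16 provides a bound on the expected connection cost of a demand $\nu$ when
Algorithm EBGS does not open any facilities in $\overline{N}(\nu)$, namely

$$\mathbb{E}[C_\nu \mid \neg\Lambda^{\nu}] \le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + 2\,C^{\mathrm{avg}}_{\mathrm{far}}(\nu). \qquad \text{(A.1)}$$

We show a stronger inequality that

$$\mathbb{E}[C_\nu \mid \neg\Lambda^{\nu}] \le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + C^{\max}_{\mathrm{cls}}(\nu) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu), \qquad \text{(A.2)}$$

which then implies (A.1) because $C^{\max}_{\mathrm{cls}}(\nu) \le C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$. The proof of (A.2) is similar to that
in [2]. For the sake of completeness, we provide it here, formulated in our terminology and
notation.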

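Assume that the event $\neg\Lambda^{\nu}$ is true, that is, Algorithm EBGS does not open any facility
in $\overline{N}(\nu)$. Let $\kappa$ be the primary demand that $\nu$ was assigned to. Also let

$$K = \overline{N}_{\mathrm{cls}}(\kappa) \setminus \overline{N}(\nu), \quad V_{\mathrm{cls}} = \overline{N}_{\mathrm{cls}}(\kappa) \cap \overline{N}_{\mathrm{cls}}(\nu) \quad \text{and} \quad V_{\mathrm{far}} = \overline{N}_{\mathrm{cls}}(\kappa) \cap \overline{N}_{\mathrm{far}}(\nu).$$

Then $K$, $V_{\mathrm{cls}}$, $V_{\mathrm{far}}$ form a partition of $\overline{N}_{\mathrm{cls}}(\kappa)$, that is, they are disjoint and their union is
$\overline{N}_{\mathrm{cls}}(\kappa)$. Moreover, we have that $K$ is not empty, because Algorithm EBGS opens some
facility in $\overline{N}_{\mathrm{cls}}(\kappa)$ and this facility cannot be in $V_{\mathrm{cls}} \cup V_{\mathrm{far}}$, by our assumption. We also have
that $V_{\mathrm{cls}}$ is not empty due to (PD'.3(a)).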

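Recall that $D(A, \eta) = \sum_{\mu \in A} d_{\mu\eta}\bar{y}_\mu / \sum_{\mu \in A} \bar{y}_\mu$ is the average distance between a demand
$\eta$ and the facilities in a set $A$. We shall show that

$$D(K, \nu) \le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\max}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu). \qquad \text{(A.3)}$$

This is sufficient, because, by the algorithm, $D(K, \nu)$ is exactly the expected connection cost
for demand $\nu$ conditioned on the event that none of $\nu$'s neighbors opens, that is the left-hand
side of (A.2). Further, (PD'.3(b)) states that $C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\max}_{\mathrm{cls}}(\kappa) \le C^{\mathrm{avg}}_{\mathrm{cls}}(\nu) + C^{\max}_{\mathrm{cls}}(\nu)$, and
thus (A.3) implies (A.2).

The proof of (A.3) is by analysis of several cases.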



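Case 1: $D(K, \kappa) \le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa)$. For any facility $\mu \in V_{\mathrm{cls}}$ (recall that $V_{\mathrm{cls}} \ne \emptyset$), we have
$d_{\mu\kappa} \le C^{\max}_{\mathrm{cls}}(\kappa)$ and $d_{\mu\nu} \le C^{\max}_{\mathrm{cls}}(\nu) \le C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$. Therefore, using the case assumption, we get

$$D(K, \nu) \le D(K, \kappa) + d_{\mu\kappa} + d_{\mu\nu} \le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\max}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu).$$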

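Case 2: There exists a facility $\mu \in V_{\mathrm{cls}}$ such that $d_{\mu\kappa} \le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa)$. Since $\mu \in V_{\mathrm{cls}}$, we infer
that $d_{\mu\nu} \le C^{\max}_{\mathrm{cls}}(\nu) \le C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$. Using $C^{\max}_{\mathrm{cls}}(\kappa)$ to bound $D(K, \kappa)$, we have

$$D(K, \nu) \le D(K, \kappa) + d_{\mu\kappa} + d_{\mu\nu} \le C^{\max}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu).$$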



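Case 3: In this case we assume that neither of Cases 1 and 2 applies, that is $D(K, \kappa) >
C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa)$ and every $\mu \in V_{\mathrm{cls}}$ satisfies $d_{\mu\kappa} > C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa)$. This implies that $D(K \cup V_{\mathrm{cls}}, \kappa) >
C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) = D(\overline{N}_{\mathrm{cls}}(\kappa), \kappa)$. Since sets $K$, $V_{\mathrm{cls}}$ and $V_{\mathrm{far}}$ form a partition of $\overline{N}_{\mathrm{cls}}(\kappa)$, we obtain
that in this case $V_{\mathrm{far}}$ is not empty and $D(V_{\mathrm{far}}, \kappa) < C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa)$. Let $\delta = C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) - D(V_{\mathrm{far}}, \kappa) > 0$.
We now have two sub-cases:

Case 3.1: $D(V_{\mathrm{far}}, \nu) \le C^{\mathrm{avg}}_{\mathrm{far}}(\nu) + \delta$. Substituting $\delta$, this implies that $D(V_{\mathrm{far}}, \nu) + D(V_{\mathrm{far}}, \kappa) \le
C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$. From the definition of the average distances $D(V_{\mathrm{far}}, \kappa)$ and $D(V_{\mathrm{far}}, \nu)$,
we obtain that there exists some $\mu \in V_{\mathrm{far}}$ such that $d_{\mu\kappa} + d_{\mu\nu} \le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$.
Thus $D(K, \nu) \le D(K, \kappa) + d_{\mu\kappa} + d_{\mu\nu} \le C^{\max}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$.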

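Case 3.2: $D(V_{\mathrm{far}}, \nu) > C^{\mathrm{avg}}_{\mathrm{far}}(\nu) + \delta$. The case assumption implies that $V_{\mathrm{far}}$ is a proper subset
of $\overline{N}_{\mathrm{far}}(\nu)$, that is $\overline{N}_{\mathrm{far}}(\nu) \setminus V_{\mathrm{far}} \ne \emptyset$. Let $\hat{y} = \gamma \sum_{\mu \in V_{\mathrm{far}}} \bar{y}_\mu$. We can express $C^{\mathrm{avg}}_{\mathrm{far}}(\nu)$
using $\hat{y}$ as follows:

$$C^{\mathrm{avg}}_{\mathrm{far}}(\nu) = D(V_{\mathrm{far}}, \nu)\, \frac{\hat{y}}{\gamma - 1} + D(\overline{N}_{\mathrm{far}}(\nu) \setminus V_{\mathrm{far}}, \nu)\, \frac{\gamma - 1 - \hat{y}}{\gamma - 1}.$$

Then, using the case condition and simple algebra, we have

$$C^{\max}_{\mathrm{cls}}(\nu) \le D(\overline{N}_{\mathrm{far}}(\nu) \setminus V_{\mathrm{far}}, \nu) \le C^{\mathrm{avg}}_{\mathrm{far}}(\nu) - \frac{\hat{y}\delta}{\gamma - 1 - \hat{y}} \le C^{\mathrm{avg}}_{\mathrm{far}}(\nu) - \frac{\hat{y}\delta}{1 - \hat{y}}, \qquad \text{(A.4)}$$

where the last step follows from $1 < \gamma < 2$.

On the other hand, since $K$, $V_{\mathrm{cls}}$, and $V_{\mathrm{far}}$ form a partition of $\overline{N}_{\mathrm{cls}}(\kappa)$, we have $C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) =
(1 - \hat{y})\, D(K \cup V_{\mathrm{cls}}, \kappa) + \hat{y}\, D(V_{\mathrm{far}}, \kappa)$. Then using the definition of $\delta$ we obtain

$$D(K \cup V_{\mathrm{cls}}, \kappa) = C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + \frac{\hat{y}\delta}{1 - \hat{y}}. \qquad \text{(A.5)}$$

Now we are essentially done. If there exists some $\mu \in V_{\mathrm{cls}}$ such that $d_{\mu\kappa} \le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) +
\hat{y}\delta/(1 - \hat{y})$, then we have

$$\begin{aligned}
D(K, \nu) &\le D(K, \kappa) + d_{\mu\kappa} + d_{\mu\nu} \\
&\le C^{\max}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + \frac{\hat{y}\delta}{1 - \hat{y}} + C^{\max}_{\mathrm{cls}}(\nu) \\
&\le C^{\max}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu),
\end{aligned}$$

where we used (A.4) in the last step. Otherwise, from (A.5), we must have $D(K, \kappa) \le
C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + \hat{y}\delta/(1 - \hat{y})$. Choosing any $\mu \in V_{\mathrm{cls}}$, it follows that

$$\begin{aligned}
D(K, \nu) &\le D(K, \kappa) + d_{\mu\kappa} + d_{\mu\nu} \\
&\le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + \frac{\hat{y}\delta}{1 - \hat{y}} + C^{\max}_{\mathrm{cls}}(\kappa) + C^{\max}_{\mathrm{cls}}(\nu) \\
&\le C^{\mathrm{avg}}_{\mathrm{cls}}(\kappa) + C^{\max}_{\mathrm{cls}}(\kappa) + C^{\mathrm{avg}}_{\mathrm{far}}(\nu),
\end{aligned}$$

again using (A.4) in the last step.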



This concludes the proof of (A.1). As explained earlier, Lemma 16 follows.
Appendix B. Proof of Inequality (5)

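In Sections 6 and 7 we use the following inequality

$$d_1 g_1 + d_2 g_2 (1 - g_1) + \ldots + d_k g_k (1 - g_1)(1 - g_2) \cdots (1 - g_{k-1}) \le \frac{1}{\sum_{s=1}^{k} g_s} \left( \sum_{s=1}^{k} d_s g_s \right) \left( \sum_{t=1}^{k} g_t \prod_{z=1}^{t-1} (1 - g_z) \right), \qquad \text{(B.1)}$$

for $0 \le d_1 \le d_2 \le \cdots \le d_k$, and $0 \le g_1, \ldots, g_k \le 1$.

We give here a new proof of this inequality, much simpler than the existing proof in [5],
and also simpler than the argument by Sviridenko [17]. We derive this inequality from the
following generalized version of the Chebyshev Sum Inequality:

$$\sum_{i} p_i \cdot \sum_{j} p_j a_j b_j \le \sum_{i} p_i a_i \cdot \sum_{j} p_j b_j, \qquad \text{(B.2)}$$

where each summation runs from 1 to $l$ and the sequences $(a_i)$, $(b_i)$ and $(p_i)$ satisfy the
following conditions: $p_i \ge 0$, $a_i \ge 0$, $b_i \ge 0$ for all $i$, $a_1 \le a_2 \le \ldots \le a_l$, and $b_1 \ge b_2 \ge \cdots \ge b_l$.
Given inequality (B.2), we can obtain our inequality (B.1) by simple substitution

$$p_i \leftarrow g_i, \quad a_i \leftarrow d_i, \quad b_i \leftarrow \prod_{s=1}^{i-1} (1 - g_s),$$

for $i = 1, \ldots, k$.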
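Inequality (B.1) is easy to test numerically. The sketch below computes both sides for given sequences, using the identity $\sum_{t} g_t \prod_{z<t} (1 - g_z) = 1 - \prod_{s} (1 - g_s)$ for the last factor on the right-hand side. The function names are illustrative, and the distances are assumed to be sorted in nondecreasing order as required by (B.1).

```python
import math


def sequential_sum(d, g):
    """Left-hand side of (B.1): d_1 g_1 + d_2 g_2 (1 - g_1) + ..."""
    total, none_open_so_far = 0.0, 1.0
    for d_s, g_s in zip(d, g):
        total += d_s * g_s * none_open_so_far
        none_open_so_far *= 1.0 - g_s
    return total


def chebyshev_bound(d, g):
    """Right-hand side of (B.1): weighted average of d times 1 - prod(1 - g_s)."""
    average = sum(d_s * g_s for d_s, g_s in zip(d, g)) / sum(g)
    return average * (1.0 - math.prod(1.0 - g_s for g_s in g))
```

For `d = [1, 2, 4]` and `g = [0.3, 0.5, 0.2]` the two sides are $1.28$ and $2.1 \cdot 0.72 = 1.512$.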